How to #2: Load Model from Endpoint for Inference with AWS

Adam Davis
3 min read · Nov 6, 2022
Over the Town — Marc Chagall

“My name is Marc, my emotional life is sensitive and my purse is empty, but they say I have talent” — Marc Chagall

I’m with you on that one Marc. Let’s get started.

In part one of How To, I walked through constructing a basic binary classification model from start to finish, including hyperparameter tuning, and ended with deployment. Now what? The next step is to actually take it to production, which is a fancy way of saying we want to use the model outside the notebook in which it was created. I'm going to show how to invoke an endpoint and write its predictions to the bucket we have created.

Import the libraries needed:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.metrics import auc, accuracy_score, confusion_matrix, classification_report
import seaborn as sns

import os
import io
import boto3
import json
import csv
from io import StringIO
import sagemaker

json is needed because the endpoint returns its output as raw bytes that json.loads can parse; pandas can handle the rest of the wrangling easily.
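To make that concrete, here is roughly what a single response body looks like once decoded. The raw_body value is made up purely for illustration; the real call appears in the inference loop below:

# Hypothetical response body from invoke_endpoint, for illustration only.
raw_body = b'0.8734'
prediction = json.loads(raw_body.decode('utf-8'))  # json.loads parses the bare number
print(type(prediction), prediction)                # <class 'float'> 0.8734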

Load Endpoint:

sm_client = boto3.client('sagemaker')        # control-plane client; named sm_client so it
                                             # doesn't shadow the sagemaker module import
ENDPOINT_NAME = 'xgb-linsearch***********'
runtime = boto3.client('runtime.sagemaker')  # runtime client that actually invokes the endpoint
bucket = '**********'
s3 = boto3.client('s3')
key = 'test_set.csv'

The endpoint and bucket names are known ahead of time, and creating them is up to the user. Now to use the endpoint to make predictions on the "test_set".
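Before sending traffic, it doesn't hurt to confirm the endpoint is actually live. This check is my own addition rather than part of the original flow, but describe_endpoint is a standard boto3 call:

# Sanity check: the endpoint should report InService before we invoke it.
status = sm_client.describe_endpoint(EndpointName=ENDPOINT_NAME)['EndpointStatus']
print('Endpoint status:', status)
assert status == 'InService', 'Endpoint is not ready for inference yet'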

Reading Input for the Endpoint:

# Pull the test set from S3 and score it one row at a time against the endpoint.
# Assumes test_set.csv holds feature rows only: no header and no label column.
response = s3.get_object(Bucket=bucket, Key=key)
content = response['Body'].read().decode('utf-8')

results = []
for line in content.splitlines():
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='text/csv',
                                       Body=line)
    result = json.loads(response['Body'].read().decode())
    results.append(result)

# Join the individual predictions into one newline-separated string
# and write them back to the bucket as a CSV.
multiLine = '\n'.join(str(item) for item in results)

file_name = "predictions.csv"
s3_resource = boto3.resource('s3')
s3_resource.Object(bucket, file_name).put(Body=multiLine)

This code lends itself well to being used as a template: it scores the test set with the model we trained and writes the predictions to your selected bucket. One row per request is the simplest pattern; if throughput matters, the built-in XGBoost container can also score several newline-separated rows in a single call.
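The accuracy check below compares predictions_array against the true labels in test, neither of which the snippet above defines. Here is one way to wire them up, as a minimal sketch: 'test_labels.csv' is a hypothetical key standing in for wherever you stored the labels when you split the data in part one.

# Endpoint outputs from the loop above, as a float array.
predictions_array = np.array(results, dtype=float)

# True labels for the test set. 'test_labels.csv' is a placeholder key --
# point this at wherever the labels landed when you split the data.
label_obj = s3.get_object(Bucket=bucket, Key='test_labels.csv')
test = np.loadtxt(io.BytesIO(label_obj['Body'].read()), delimiter=',')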

Accuracy:

cm = metrics.confusion_matrix(test, np.round(predictions_array))
class_names = [0, 1]  # names of the classes
fig, ax = plt.subplots()
tick_marks = np.arange(len(class_names))
plt.xticks(tick_marks, class_names)
plt.yticks(tick_marks, class_names)
# create heatmap
sns.heatmap(pd.DataFrame(cm), annot=True, cmap="YlGnBu", fmt='g')
ax.xaxis.set_label_position("top")
plt.tight_layout()
plt.title('Confusion matrix', y=1.1)
plt.ylabel('Actual label')
plt.xlabel('Predicted label')

Great! Let's look a little closer:

print(classification_report(test, np.round(predictions_array)))

Still seems pretty good as far as performance goes. Let's not forget to clean up by deleting both the endpoint and the bucket contents from the inference:

# Tear down: delete the endpoint, then empty the bucket.
sm_client.delete_endpoint(EndpointName=ENDPOINT_NAME)
bucket_to_delete = boto3.resource('s3').Bucket(bucket)
bucket_to_delete.objects.all().delete()
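Deleting the endpoint doesn't remove the endpoint configuration or the model object it was serving, so those are worth cleaning up too. A minimal sketch, assuming the config and model reuse the endpoint's name (check the SageMaker console for the real names if they differ):

# Optional extra cleanup -- the names here are assumptions, verify them first.
sm_client.delete_endpoint_config(EndpointConfigName=ENDPOINT_NAME)
sm_client.delete_model(ModelName=ENDPOINT_NAME)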

There we go, that wasn't too bad. Next up will be how to set up an API for some sort of input. I'm thinking a possible sudoku solver website?

Thanks
