Learning to deploy a pre-trained model on AWS SageMaker
Deep learning algorithms have proven effective in tackling complex challenges. As engineers, we are increasingly focused on integrating these models into production systems. Often, we need to build intelligent AI systems accessible from any location and client platform. A cloud-based approach, compared to an on-premise solution, is designed to facilitate exactly this kind of flexibility.
Currently, AWS is the largest provider of cloud service solutions. As a complete beginner in cloud systems, I decided to start my journey here. With my background in computer vision, I was particularly interested in finding a straightforward way to deploy a pre-trained image segmentation model on SageMaker. In today’s post (consisting of two parts), I’ll outline a simple pipeline that shows how I set up a SageMaker endpoint and used it to generate a mask for a query image, as illustrated below:
Note
I mostly referred to this tutorial notebook, excluding the finetuning part. After trying to replicate it locally, I kept receiving ClientError: An error occurred (404) when calling the HeadObject operation: Not Found. This bug is also reported in Issue #4478, and I hope it will be resolved soon. In the meantime, I found a workaround: instead of creating the model (sagemaker.model.Model) and the endpoint (sagemaker.model.Model.deploy) through the SDK, I created both using the AWS console. A sketch of the SDK route is included below for reference.
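For reference, here is a minimal sketch of what the SDK route would look like. All identifiers below (container image URI, model artifact path, role ARN, instance type) are hypothetical placeholders, not values from this post:

from sagemaker.model import Model

# All identifiers below are hypothetical placeholders.
model = Model(
    image_uri="<inference-container-image-uri>",
    model_data="s3://<bucket>/<prefix>/model.tar.gz",
    role="arn:aws:iam::<account-id>:role/<sagemaker-execution-role>",
)

# deploy() creates the endpoint configuration and the endpoint in one call.
model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="endpoint-segment",
)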
Part 1. Create endpoint
In this section, we will walk through the steps for creating a model and configuring an endpoint to handle queries.
1) Visit the GluonCV DeepLab Semantic Segmentation page. This model generates a mask that describes the category for each pixel in the input image, as outlined in the product overview. The model is free; you only need to pay for the cost of the running endpoint, which may vary depending on the region. Additionally, there are many other available models in the AWS Marketplace.
2) After reviewing the basic information, press Continue to subscribe and then Continue to configuration.
3) On the configuration page, there are three “Available launch methods” for interacting with the model. I personally prefer the “SageMaker console” due to its simplicity. After confirming the region and the “Amazon SageMaker options,” we can proceed by selecting View in Amazon SageMaker.
4) We will then be redirected to the model creation page. In the required fields, provide a “Model name” (this can be any name) and an “IAM role” (not to be confused with an IAM user). For more information on IAM roles, refer to this essential reading. You can leave the default settings for the remaining fields and click Next. If you do not have a suitable role yet, one can also be created programmatically, as sketched below.
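Here is a minimal sketch, assuming Boto3 and sufficient IAM permissions, of how a SageMaker execution role could be created; the role name is a hypothetical placeholder:

import json
import boto3

iam = boto3.client("iam")

# Trust policy that lets the SageMaker service assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

iam.create_role(
    RoleName="sagemaker-execution-role",  # hypothetical name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Attach the AWS-managed policy granting SageMaker permissions.
iam.attach_role_policy(
    RoleName="sagemaker-execution-role",
    PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
)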
5) On the next page, enter an “Endpoint name” (this can be any name). This page will also prompt you to enter an “Endpoint configuration”. If no endpoint configurations exist, you can create a new one by clicking Create endpoint configuration. After the endpoint configuration is created, click Submit.
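For completeness, here is a minimal sketch of the same two console steps done with Boto3; the model and configuration names are hypothetical, and the instance type is an assumption:

import boto3

sm_client = boto3.client("sagemaker")

# Endpoint configuration: which model to serve and on what hardware.
sm_client.create_endpoint_config(
    EndpointConfigName="endpoint-segment-config",  # hypothetical name
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "model-segment",  # the model name chosen in step 4 (hypothetical)
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.xlarge",  # assumption
        }
    ],
)

# The endpoint itself, backed by the configuration above.
sm_client.create_endpoint(
    EndpointName="endpoint-segment",
    EndpointConfigName="endpoint-segment-config",
)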
A few minutes later, the endpoint will be created and ready for service. This can be confirmed from the “Inference” → “Endpoints” tab of the “Amazon SageMaker” menu, or programmatically, as shown below.
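As a quick sanity check before sending any queries, the endpoint status can also be polled with Boto3; the endpoint name matches the one used throughout this post:

import boto3

sm_client = boto3.client("sagemaker")

# Block until the endpoint reaches the InService state.
sm_client.get_waiter("endpoint_in_service").wait(EndpointName="endpoint-segment")

# Confirm the final status.
status = sm_client.describe_endpoint(EndpointName="endpoint-segment")["EndpointStatus"]
print(status)  # expected: "InService"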
Part 2. Invoke endpoint
In this section, we will send a query using the SageMaker API and process the response from the endpoint.
1) First, let’s download a sample image from Amazon S3 to a local directory using the AWS SDK for Python (Boto3).
import boto3
from PIL import Image

def get_image():
    """Download a sample pedestrian image from the JumpStart assets bucket on S3."""
    aws_region = "us-east-1"
    jumpstart_assets_s3_bucket = f"jumpstart-cache-prod-{aws_region}"
    pedestrian_img_key_prefix = "inference-notebook-assets"
    img_fname = "img_pedestrian.png"
    # Fetch the image from S3 into the current working directory.
    boto3.client("s3").download_file(
        jumpstart_assets_s3_bucket, f"{pedestrian_img_key_prefix}/{img_fname}", img_fname
    )
    img = Image.open(img_fname)
    return img, img_fname

img, img_fname = get_image()
2) Next, we create a Predictor. Here, we need to specify the endpoint name that we previously defined in the AWS console.
from sagemaker.predictor import Predictor
predictor = Predictor(endpoint_name="endpoint-segment")
3) As described in the usage information of the GluonCV DeepLab Semantic Segmentation model, it supports requests with the MIME types image/jpeg, image/png, and image/bmp. Since the input image is a .png, I set the ContentType field accordingly. Accept specifies the content type expected back from the inference endpoint. The model’s result is contained in the prediction key of the JSON response, which we need to extract.
import json

def query(model_predictor, image_file_name):
    """Query the model predictor with the raw bytes of an image file."""
    with open(image_file_name, "rb") as file:
        input_img_rb = file.read()
    # Declare the request content type and the expected response type.
    query_response = model_predictor.predict(
        input_img_rb,
        {
            "ContentType": "image/png",
            "Accept": "application/json;verbose",
        },
    )
    return query_response

def parse_response(query_response):
    """Parse the JSON response and return the per-pixel class predictions."""
    response_dict = json.loads(query_response)
    return response_dict["prediction"]

query_response = query(predictor, img_fname)
prediction = parse_response(query_response)
4) The following function, borrowed from the notebook, transforms the response into a numpy array. Each class in the mask is assigned a unique color, which will be used for visualization.
import numpy as np

def getvocpalette(num_cls):
    """Get a color palette with a unique RGB color per class index."""
    n = num_cls
    palette = [0] * (n * 3)
    for j in range(0, n):
        lab = j
        palette[j * 3 + 0] = 0
        palette[j * 3 + 1] = 0
        palette[j * 3 + 2] = 0
        i = 0
        # Spread the bits of the label index across the RGB channels.
        while lab > 0:
            palette[j * 3 + 0] |= ((lab >> 0) & 1) << (7 - i)
            palette[j * 3 + 1] |= ((lab >> 1) & 1) << (7 - i)
            palette[j * 3 + 2] |= ((lab >> 2) & 1) << (7 - i)
            i = i + 1
            lab >>= 3
    return palette

def get_mask(predictions):
    """Display predictions with each pixel substituted by the color of the corresponding label."""
    palette = getvocpalette(150)
    npimg = np.array(predictions)
    # Map the "no label" value (-1) to the reserved palette index 255.
    npimg[npimg == -1] = 255
    mask = Image.fromarray(npimg.astype("uint8"))
    mask.putpalette(palette)
    return mask

mask = get_mask(prediction)
5) Finally, we can visualize the mask alongside the input image.
import matplotlib.pyplot as plt

def plot_img(img, mask):
    """Show the query image and the predicted mask side by side."""
    fig = plt.figure(figsize=(10, 7))
    rows, columns = 1, 2
    fig.add_subplot(rows, columns, 1)
    plt.imshow(img)
    plt.axis("off")
    plt.title("Query image")
    fig.add_subplot(rows, columns, 2)
    plt.imshow(mask)
    plt.axis("off")
    plt.title("Response from SageMaker Endpoint")
    plt.show()

plot_img(img, mask)