Learning to deploy a pre-trained model on AWS SageMaker
Deep learning algorithms have proven effective in tackling complex challenges. As engineers, we are increasingly focused on integrating these models into production systems. Often, we need to build intelligent AI systems accessible from any location and client platform. A cloud-based approach, compared to an on-premise solution, is designed to facilitate exactly this kind of flexibility.
Currently, AWS is the largest provider of cloud service solutions. As a complete beginner in cloud systems, I decided to start my journey here. With my background in computer vision, I was particularly interested in finding a straightforward way to deploy a pre-trained image segmentation model on SageMaker. In today’s post (consisting of two parts), I’ll outline a simple pipeline that shows how I set up a SageMaker endpoint and used it to generate a mask for a query image, as illustrated below:
Note
I mostly referred to this tutorial notebook, excluding the finetuning part. After trying to replicate it locally, I kept receiving ClientError: An error occurred (404) when calling the HeadObject operation: Not Found. This bug is also reported in Issue #4478, and I hope it will be resolved soon. In the meantime, I found a workaround: instead of creating the model (sagemaker.model.Model) and the endpoint (sagemaker.model.Model.deploy) through the SDK, I created both using the AWS console. A sketch of the SDK route is included below for reference.
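For reference, here is a minimal sketch of what the SDK route would look like. All identifiers below (container image URI, model artifact path, role ARN, instance type) are hypothetical placeholders, not values from this post:

from sagemaker.model import Model

# All identifiers below are hypothetical placeholders.
model = Model(
    image_uri="<inference-container-image-uri>",
    model_data="s3://<bucket>/<prefix>/model.tar.gz",
    role="arn:aws:iam::<account-id>:role/<sagemaker-execution-role>",
)

# deploy() creates the endpoint configuration and the endpoint in one call.
model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="endpoint-segment",
)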
Part 1. Create endpoint
In this section, we will walk through the steps for creating a model and configuring an endpoint to handle queries.
1) Visit the GluonCV DeepLab Semantic Segmentation page. This model generates a mask that describes the category for each pixel in the input image, as outlined in the product overview. The model is free; you only need to pay for the cost of the running endpoint, which may vary depending on the region. Additionally, there are many other available models in the AWS Marketplace.
2) After reviewing the basic information, press Continue to subscribe and then Continue to configuration.
3) On the configuration page, there are three “Available launch methods” for interacting with the model. I personally prefer the “SageMaker console” due to its simplicity. After confirming the region and the “Amazon SageMaker options,” we can proceed by selecting View in Amazon SageMaker.
4) We will then be redirected to the model creation page. In the required fields, provide a “Model name” (this can be any name) and an “IAM role” (not to be confused with an IAM user). For more information on IAM roles, refer to this essential reading. You can leave the default settings for the remaining fields and click Next. If you do not have a suitable role yet, one can also be created programmatically, as sketched below.
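Here is a minimal sketch, assuming Boto3 and sufficient IAM permissions, of how a SageMaker execution role could be created; the role name is a hypothetical placeholder:

import json
import boto3

iam = boto3.client("iam")

# Trust policy that lets the SageMaker service assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

iam.create_role(
    RoleName="sagemaker-execution-role",  # hypothetical name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Attach the AWS-managed policy granting SageMaker permissions.
iam.attach_role_policy(
    RoleName="sagemaker-execution-role",
    PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
)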
5) On the next page, enter an “Endpoint name” (this can be any name). This page will also prompt you to enter an “Endpoint configuration”. If no endpoint configurations exist, you can create a new one by clicking Create endpoint configuration. After the endpoint configuration is created, click Submit.
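For completeness, here is a minimal sketch of the same two console steps done with Boto3; the model and configuration names are hypothetical, and the instance type is an assumption:

import boto3

sm_client = boto3.client("sagemaker")

# Endpoint configuration: which model to serve and on what hardware.
sm_client.create_endpoint_config(
    EndpointConfigName="endpoint-segment-config",  # hypothetical name
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "model-segment",  # the model name chosen in step 4 (hypothetical)
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.xlarge",  # assumption
        }
    ],
)

# The endpoint itself, backed by the configuration above.
sm_client.create_endpoint(
    EndpointName="endpoint-segment",
    EndpointConfigName="endpoint-segment-config",
)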
A few minutes later, the endpoint will be created and ready for service. This can be confirmed from the “Inference” → “Endpoints” tab of the “Amazon SageMaker” menu, or programmatically, as shown below.
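As a quick sanity check before sending any queries, the endpoint status can also be polled with Boto3; the endpoint name matches the one used throughout this post:

import boto3

sm_client = boto3.client("sagemaker")

# Block until the endpoint reaches the InService state.
sm_client.get_waiter("endpoint_in_service").wait(EndpointName="endpoint-segment")

# Confirm the final status.
status = sm_client.describe_endpoint(EndpointName="endpoint-segment")["EndpointStatus"]
print(status)  # expected: "InService"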
Part 2. Invoke endpoint
In this section, we will send a query using the SageMaker API and process the response from the endpoint.
1) First, let’s download a sample image from Amazon S3 to a local directory using the AWS SDK for Python (Boto3).
import boto3
from PIL import Image

def get_image():
    """Download a sample pedestrian image from the JumpStart assets bucket on S3."""
    aws_region = "us-east-1"
    jumpstart_assets_s3_bucket = f"jumpstart-cache-prod-{aws_region}"
    pedestrian_img_key_prefix = "inference-notebook-assets"
    img_fname = "img_pedestrian.png"
    # Fetch the image from S3 into the current working directory.
    boto3.client("s3").download_file(
        jumpstart_assets_s3_bucket, f"{pedestrian_img_key_prefix}/{img_fname}", img_fname
    )
    img = Image.open(img_fname)
    return img, img_fname

img, img_fname = get_image()
2) Next, we create a Predictor. Here, we need to specify the endpoint name that we previously defined in the AWS console.
from sagemaker.predictor import Predictor
predictor = Predictor(endpoint_name="endpoint-segment")
3) As described in the usage information of the GluonCV DeepLab Semantic Segmentation model, it supports requests with the MIME types image/jpeg, image/png, and image/bmp. Since the input image is a .png, I set the ContentType field accordingly. Accept specifies the content type expected back from the inference endpoint. The model’s result is contained in the prediction key of the JSON response, which we need to extract.
import json

def query(model_predictor, image_file_name):
    """Query the model predictor with the raw bytes of an image file."""
    with open(image_file_name, "rb") as file:
        input_img_rb = file.read()
    # Declare the request content type and the expected response type.
    query_response = model_predictor.predict(
        input_img_rb,
        {
            "ContentType": "image/png",
            "Accept": "application/json;verbose",
        },
    )
    return query_response

def parse_response(query_response):
    """Parse the JSON response and return the per-pixel class predictions."""
    response_dict = json.loads(query_response)
    return response_dict["prediction"]

query_response = query(predictor, img_fname)
prediction = parse_response(query_response)
4) The following function, borrowed from the notebook, transforms the response into a numpy array. Each class in the mask is assigned a unique color, which will be used for visualization.
import numpy as np

def getvocpalette(num_cls):
    """Get a color palette with a unique RGB color per class index."""
    n = num_cls
    palette = [0] * (n * 3)
    for j in range(0, n):
        lab = j
        palette[j * 3 + 0] = 0
        palette[j * 3 + 1] = 0
        palette[j * 3 + 2] = 0
        i = 0
        # Spread the bits of the label index across the RGB channels.
        while lab > 0:
            palette[j * 3 + 0] |= ((lab >> 0) & 1) << (7 - i)
            palette[j * 3 + 1] |= ((lab >> 1) & 1) << (7 - i)
            palette[j * 3 + 2] |= ((lab >> 2) & 1) << (7 - i)
            i = i + 1
            lab >>= 3
    return palette

def get_mask(predictions):
    """Display predictions with each pixel substituted by the color of the corresponding label."""
    palette = getvocpalette(150)
    npimg = np.array(predictions)
    # Map the "no label" value (-1) to the reserved palette index 255.
    npimg[npimg == -1] = 255
    mask = Image.fromarray(npimg.astype("uint8"))
    mask.putpalette(palette)
    return mask

mask = get_mask(prediction)
5) Finally, we can visualize the mask alongside the input image.
import matplotlib.pyplot as plt

def plot_img(img, mask):
    """Show the query image and the predicted mask side by side."""
    fig = plt.figure(figsize=(10, 7))
    rows, columns = 1, 2
    fig.add_subplot(rows, columns, 1)
    plt.imshow(img)
    plt.axis("off")
    plt.title("Query image")
    fig.add_subplot(rows, columns, 2)
    plt.imshow(mask)
    plt.axis("off")
    plt.title("Response from SageMaker Endpoint")
    plt.show()

plot_img(img, mask)