Monitoring a SageMaker-Deployed Model in DataRobot MLOps


This article outlines how to monitor a model developed and deployed on AWS SageMaker for real-time API scoring.  DataRobot will monitor the model through a remote agent architecture, which does not require a direct connection between the AWS model and DataRobot. The scope of this article covers populating data into a monitoring queue; however, constructing a queue and consuming data from it is covered in a separate article: Model Monitoring with Serverless MLOps Agents.

Technical Architecture

[Screenshot: deployment architecture diagram]

The deployment architecture above will be constructed in the following sections.  Components are explained in more detail below.

  1. An API client will assemble a single line JSON request of raw data input for scoring, which will be posted to an API Gateway-exposed endpoint.
  2. API Gateway will simply act as a pass-through and submit the request to an associated Lambda function for handling.
  3. Logic in the Lambda will process the raw input data and parse it into the format required to score through the SageMaker endpoint, a headerless CSV in this case for an XGBoost model. The SageMaker endpoint will be invoked.
  4. The SageMaker endpoint will satisfy the request by passing it to a standing deployed EC2 instance hosting the real-time model. The model deployment line from the AWS code in Community GitHub (xgb.deploy) takes care of standing up this machine and bringing the trained AWS ECR hosted model to it.
  5. The raw score is processed by the Lambda; in this use case, a threshold is applied to select a binary classification label.
  6. Timing, input data, and model results are written to an SQS queue.
  7. The processed response is sent back to API Gateway.
  8. The processed response is passed back to the client.
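
Steps 3 and 5 amount to thin glue logic around the endpoint call. The following is a minimal sketch; the function names are illustrative, not taken from the actual Lambda code:

```python
import json

def to_csv_payload(values):
    """Step 3: flatten ordered, preprocessed feature values into the
    headerless CSV string the XGBoost endpoint expects."""
    return ",".join(str(v) for v in values)

def to_label(raw_score, threshold=0.5):
    """Step 5: apply a threshold to the raw score to select a binary
    classification label."""
    return "yes" if raw_score >= threshold else "no"

# example: shape a request body and post-process a raw score
event = json.loads('{"data": {"age": 29, "campaign": 2, "pdays": 999}}')
payload = to_csv_payload(event["data"].values())   # "29,2,999"
label = to_label(0.12)                             # "no"
```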

Out of scope for this article is setting up the monitoring agent architecture to consume the SQS queue and report data back to DataRobot. There are multiple examples of this in other community articles, such as Model Monitoring with Serverless MLOps Agents.

Create a Custom SageMaker Model

This article is based on the SageMaker notebook example in the AWS GitHub (located here).  The use case is to predict which customers will respond positively to a direct marketing campaign. The code has been updated to conform to v2 of the SageMaker SDK and can be found in the DataRobot Community GitHub here.

Completion of the notebook in AWS SageMaker will result in a deployed model at a SageMaker endpoint named xgboost-direct-marketing hosted on a standing ml.m4.xlarge instance.  Note the endpoint is expecting fully prepared and preprocessed data (one hot encoding applied, for example) in the same order it was provided in during training.  There are several ways to test the SageMaker endpoint; following is a short Python script that can score a record from the validation set. (The target column has been dropped.)


import boto3
import json

runtime = boto3.Session().client('sagemaker-runtime', use_ssl=True)
endpoint_name = 'xgboost-direct-marketing'

# one preprocessed validation record, as a headerless CSV string
payload = '29,2,999,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0'

response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                   ContentType='text/csv',
                                   Body=payload)
result = json.loads(response['Body'].read())


Create an External Deployment in DataRobot

Inside DataRobot, a deployment entry must be created to monitor the SageMaker model. Data and statistics will be reported to this deployment for processing, visualization, and analysis. To do this, navigate to Model Registry > Model Packages tab. Click Add New Package, and select New external model package.


Fill out the form as shown below. Upload the bank-additional-full.csv file (downloaded from the Community GitHub code and then extracted) as the training dataset. Although this is an example, for the sake of completeness it’s important to note that the model in the GitHub example is not retrained on 100% of the data before deployment; in real-world ML, doing so is a practice worth considering.

[Screenshot: new external model package form]

Click Create package to complete creation of this model package and add it to the Model Registry. The next step is to create a deployment for it, which can be done from the Actions menu for the package (as shown below).

[Screenshot: package Actions menu with Create deployment option]

Toggle on drift tracking and select Create deployment.

[Screenshot: deployment creation with drift tracking toggled on]

Additional prediction environment metadata may also be configured and specified if needed.  This will capture information around where the externally deployed model resides, e.g., AWS in this case.

[Screenshot: prediction environment configuration]

Upon completion of deployment creation, some ID values need to be retrieved.  These will be associated with the model in SageMaker.  Under the deployment, navigate to the Predictions > Monitoring tab and view the Monitoring Code. Copy the values for MLOPS_DEPLOYMENT_ID and MLOPS_MODEL_ID.

Note that the MLOPS_DEPLOYMENT_ID is associated with the entry within model monitoring, while the MLOPS_MODEL_ID identifies the actual scoring model behind it. The MLOPS_DEPLOYMENT_ID can be expected to stay static; however, a user may replace the SageMaker model at some point. If this is done, one of two actions should be taken: either create a completely new external deployment in DataRobot following the same steps as above, or register a new model package and replace the model currently hosted at this deployment with that new package. In the latter scenario a new MLOPS_MODEL_ID will be assigned, which must then be used to update the Lambda environment variables. The same MLOPS_DEPLOYMENT_ID entry in DataRobot will show statistics for both models under the same entry and note when the change occurred.

Create an IAM Role for Lambda to Execute the SageMaker Endpoint

A role will be used by the Lambda function to score data through the SageMaker endpoint.  Navigate to the IAM service within the AWS Console.  Click Create role, choose "Lambda" as the use case, and then Next: Permissions.  Select Create Policy, choose the JSON tab, and paste in the following snippet:


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "*"
        }
    ]
}

Next select Review policy, and name it lambda_sagemaker_execution_policy, then click Create policy. The policy can now be attached to the role in the previous workflow. To do this, select the refresh button and filter on the string "sagemaker." Select this newly created policy from the list. Click Next: Tags, set any desired, and then hit Next: Review. Name the role lambda_sagemaker_execution_role and hit Create role.

This role requires additional resources so that the Lambda can send reporting data to an SQS queue. An example of creating a queue to receive reporting data can be found in this article. The queue created in the article (sqs_mlops_data_queue) will be reused here.

To add the additional resources, view the IAM role lambda_sagemaker_execution_role, select Add inline policy, and perform a search for the SQS service. Select List, Read, and Write access levels; optionally, deselect ReceiveMessage under the Read heading so that this role may not pop items off the queue. Expand Resources to limit the role to just the specific data queue, and populate the ARN of the queue.

[Screenshot: inline policy with SQS access levels and queue ARN]

Hit Review policy, name it lambda_agent_sqs_write_policy, and finalize by clicking Create policy.
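
Before wiring up the agent, the role's write access to the queue can be sanity-checked with a short boto3 snippet. The queue URL below is a hypothetical placeholder, and the real monitoring payloads are written by the MLOps library, not by hand:

```python
import json

def build_test_message(queue_url):
    """Assemble SendMessage arguments for a throwaway smoke-test record."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps({"smoke_test": True}),
    }

def send_test_message(queue_url):
    import boto3  # available by default in the Lambda Python runtime
    sqs = boto3.client("sqs")
    return sqs.send_message(**build_test_message(queue_url))

# hypothetical URL -- substitute the URL of sqs_mlops_data_queue
args = build_test_message(
    "https://sqs.us-east-1.amazonaws.com/123456789012/sqs_mlops_data_queue")
```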

Additional privileges are required to allow the Lambda to write log entries to CloudWatch. Select Attach policies, filter on AWSLambdaBasicExecutionRole, select that privilege, and click Attach policy. The completed permissions for the role should look similar to below.

[Screenshot: completed permissions for lambda_sagemaker_execution_role]

Create a Layer for the Lambda

Lambda allows for layers, which are additional libraries that can be used by Lambda at runtime. You need to download the MLOps agent library from the DataRobot UI, under the Profile > Developer Tools menu. The package used for this article (available at the time of this writing) is datarobot_mlops_package-6.3.3-488; it includes numpy and pandas as well. These packages were used in data prep for the model, and the same code will be reused in the Lambda function.

The Lambda environment will be Python 3.7 on Amazon Linux. To ensure a layer will work with the Lambda, you can first create one on a small Amazon Linux EC2 instance. Instructions to install Python 3 on Amazon Linux are available here. Once the model package is on the server, perform the following steps.


gunzip datarobot_mlops_package-6.3.3-488.tar.gz
tar -xvf datarobot_mlops_package-6.3.3-488.tar
cd datarobot_mlops_package-6.3.3
python3 -m venv my_agent/env
source my_agent/env/bin/activate
pip install lib/datarobot_mlops-*-py2.py3-none-any.whl
cd my_agent/env
mkdir -p python/lib/python3.7/site-packages
cp -r lib/python3.7/site-packages/* python/lib/python3.7/site-packages/
zip -r9 ../python37_agent633_488.zip python
cd ..
aws s3 cp python37_agent633_488.zip s3://some-bucket/layers/


In AWS, navigate to Lambda -> Additional resources -> Layers, then select Create layer.  Name the layer python37_agent633_488 and (optionally) choose a Python 3.7 runtime.  Select "Upload a file from S3" and provide the S3 address of the uploaded file (e.g., s3://some-bucket/layers/python37_agent633_488.zip). Select Create layer to save.

Create a Lambda that Calls the SageMaker Runtime InvokeEndpoint

The SageMaker endpoint accepts raw, ready-to-score data; however, this is not very friendly to API clients. A record ready to be scored looks like the headerless CSV payload in the test script earlier. Another example can be found on AWS here.  This puts the onus of data prep on the client application, and makes for neither a friendly API nor one that captures raw data to be monitored for drift over time.

A Lambda will be created to process the actual data used by a client and make it ready for scoring. The returned score will be decoded as well, making it much friendlier for calling applications. Navigate to the AWS Lambda service in the console. Choose Create function, select Author from scratch, pick the Python 3.7 runtime, and under Permissions choose the default execution role to be the previously created lambda_sagemaker_execution_role. Name the function lambda-direct-marketing and hit Create function.

On the next screen, choose to edit the environment variables. Create the following, replacing the placeholder values with the DataRobot MLOPS_DEPLOYMENT_ID and MLOPS_MODEL_ID copied earlier. Also provide the URL for the AWS SQS queue to use as a reporting channel (the exact variable name for the queue is shown in the deployment's Monitoring Code snippet).

ENDPOINT_NAME        xgboost-direct-marketing
MLOPS_DEPLOYMENT_ID  <deployment ID copied from DataRobot>
MLOPS_MODEL_ID       <model ID copied from DataRobot>
MLOPS_SQS_QUEUE_URL  <URL of the SQS reporting queue>

The Lambda designer window also has a spot for selecting Layers. Choose this box and then select Add a layer from the layers form.  Select Custom layers and choose the created layer. Only layers that have a runtime matching the Lambda runtime will show up in this list, although a layer can be explicitly chosen by ARN if you opt to specify one.

Use the following code for the Lambda body.


import os
import io
import boto3
import json
import csv
import time
import pandas as pd
import numpy as np
from datarobot.mlops.mlops import MLOps

# grab environment variables
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
runtime = boto3.client('runtime.sagemaker')

def lambda_handler(event, context):
    # this is designed to work with only one record, supplied as json
    # start the clock
    start_time = time.time()
    # parse input data
    print("Received event: " + json.dumps(event, indent=2))
    parsed_event = json.loads(json.dumps(event))
    payload_data = parsed_event['data']
    data = pd.DataFrame(payload_data, index=[0])
    input_data = data

    # repeat data steps from training notebook
    data['no_previous_contact'] = np.where(data['pdays'] == 999, 1, 0)                                 # Indicator variable to capture when pdays takes a value of 999
    data['not_working'] = np.where(np.in1d(data['job'], ['student', 'retired', 'unemployed']), 1, 0)   # Indicator for individuals not actively employed
    model_data = pd.get_dummies(data)
    model_data = model_data.drop(['duration', 'emp.var.rate', 'cons.price.idx', 'cons.conf.idx', 'euribor3m', 'nr.employed'], axis=1)
    # xgb sagemaker endpoint features
    # order/type required as was deployed in sagemaker notebook
    model_features = ['age', 'campaign', 'pdays', 'previous', 'no_previous_contact',
       'not_working', 'job_admin.', 'job_blue-collar', 'job_entrepreneur',
       'job_housemaid', 'job_management', 'job_retired', 'job_self-employed',
       'job_services', 'job_student', 'job_technician', 'job_unemployed',
       'job_unknown', 'marital_divorced', 'marital_married', 'marital_single',
       'marital_unknown', 'education_basic.4y', 'education_basic.6y',
       'education_basic.9y', 'education_high.school', 'education_illiterate',
       'education_professional.course', 'education_university.degree',
       'education_unknown', 'default_no', 'default_unknown', 'default_yes',
       'housing_no', 'housing_unknown', 'housing_yes', 'loan_no',
       'loan_unknown', 'loan_yes', 'contact_cellular', 'contact_telephone',
       'month_apr', 'month_aug', 'month_dec', 'month_jul', 'month_jun',
       'month_mar', 'month_may', 'month_nov', 'month_oct', 'month_sep',
       'day_of_week_fri', 'day_of_week_mon', 'day_of_week_thu',
       'day_of_week_tue', 'day_of_week_wed', 'poutcome_failure',
       'poutcome_nonexistent', 'poutcome_success']
    # create base generic single row to score with defaults
    feature_dict = { i : 0 for i in model_features }
    feature_dict['pdays'] = 999
    # get column values from received and processed data
    input_features = model_data.columns
    # replace value in to-be-scored record, if input data provided a value
    for feature in input_features:
        if feature in feature_dict:
            feature_dict[feature] = model_data[feature]
    # make a csv string to score
    payload = pd.DataFrame(feature_dict).to_csv(header=None, index=False).strip('\n').split('\n')[0]
    print("payload is:" + str(payload))
    # stamp for data prep
    prep_time = time.time()
    print('data prep took: ' + str(round((prep_time - start_time) * 1000, 1)) + 'ms')
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='text/csv',
                                       Body=payload)
    # process returned data
    pred = json.loads(response['Body'].read().decode())
    # if scored value is >= 0.5, then return a 'yes' that the client will subscribe to a term deposit
    predicted_label = 'yes' if pred >= 0.5 else 'no'
    # initialize mlops monitor
    m = MLOps().init()

    # MLOPS: report test features and predictions
    m.report_predictions_data(features_df = input_data
        , class_names = ['yes', 'no']
        , predictions = [[pred, 1-pred]])    # yes, no
    # report lambda timings (excluding lambda startup and imports...)
    # MLOPS: report deployment metrics: number of predictions and execution time
    end_time = time.time()
    m.report_deployment_stats(1, (end_time - start_time) * 1000)
    print("pred is: " + str(pred))
    print("label is: " + str(predicted_label))
    return predicted_label


Test the Lambda

Click Configure test events in the upper right of the Lambda screen to configure a test JSON record. Use the following JSON record format:


{
  "data": {
    "age": 56,
    "job": "housemaid",
    "marital": "married",
    "education": "basic.4y",
    "default": "no",
    "housing": "no",
    "loan": "no",
    "contact": "telephone",
    "month": "may",
    "day_of_week": "mon",
    "duration": 261,
    "campaign": 1,
    "pdays": 999,
    "previous": 0,
    "poutcome": "nonexistent",
    "emp.var.rate": 1.1,
    "cons.price.idx": 93.994,
    "cons.conf.idx": -36.4,
    "euribor3m": 4.857,
    "nr.employed": 5191
  }
}

Select Test to score a record through the Lambda service and SageMaker endpoint.
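
The same test can be run outside the console by invoking the function directly with boto3 (AWS credentials with lambda:InvokeFunction permission assumed; the helper that builds the arguments is illustrative):

```python
import json

def build_invoke_args(function_name, record):
    """Wrap a raw record in the {"data": ...} envelope the handler expects."""
    return {
        "FunctionName": function_name,
        "Payload": json.dumps({"data": record}),
    }

def invoke(function_name, record):
    import boto3  # requires AWS credentials configured locally
    client = boto3.client("lambda")
    resp = client.invoke(**build_invoke_args(function_name, record))
    return json.loads(resp["Payload"].read())

args = build_invoke_args("lambda-direct-marketing", {"age": 56, "pdays": 999})
```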

Lambda Resource Settings and Performance Implications

Serverless computational resources can be allocated from 128MB to 10240MB, which can be changed on the Lambda console under Basic settings. This results in the allocation of anywhere from a partial vCPU to six full vCPUs for each Lambda run. Lambda cold and warm starts and the EC2 host sizing/scaling for the SageMaker endpoint are beyond the scope of this article, but the resources allocated to the Lambda itself impact pre/post scoring processing and overall Lambda performance.

128MB for this code will produce noticeably slower processing times, although diminishing returns are to be expected as RAM and CPU are upsized. For this example, 1706MB (and one full vCPU) provided good results.
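
Memory can also be adjusted programmatically. The following is a sketch of the argument validation, assuming boto3 and the update_function_configuration call:

```python
def memory_update_args(function_name, memory_mb):
    """Build arguments for lambda.update_function_configuration().
    Lambda accepts memory sizes from 128MB to 10240MB."""
    if not 128 <= memory_mb <= 10240:
        raise ValueError("Lambda memory must be between 128 and 10240 MB")
    return {"FunctionName": function_name, "MemorySize": memory_mb}

# e.g., boto3.client("lambda").update_function_configuration(
#           **memory_update_args("lambda-direct-marketing", 1706))
```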

Expose the Lambda via API Gateway

Navigate to the API Gateway service in AWS and click Create API. Choose to build a REST API, name it lambda-direct-marketing-api, and then click Create API again. Under the Resources section of the entry, choose Actions -> Create Resource, name it predict, and select Create Resource. Highlight the resource, choose Actions -> Create Method, and select a POST method. Choose the Integration Type “Lambda Function” and the Lambda Function “lambda-direct-marketing”, and click Save.

[Screenshot: API Gateway resource creation]

[Screenshot: POST method with Lambda Function integration]

You can select the TEST button on the client side of the method and use the same payload that was used in the Lambda test event (see the section “Test the Lambda”). Next, choose Actions -> Deploy API, choose a Stage name such as test, and select Deploy.

Test the Exposed API

The model is now deployed and available via the Invoke URL provided after deployment. The same test record used above (in the section “Test the Lambda”) can be used to score the model via an HTTP request. Below is an example of doing so using curl and an inline JSON record.

Expected no:

curl -X POST "" --data '{"data": {"age": 56, "job": "housemaid", "marital": "married", "education": "basic.4y", "default": "no", "housing": "no", "loan": "no", "contact": "telephone", "month": "may", "day_of_week": "mon", "duration": 261, "campaign": 1, "pdays": 999, "previous": 0, "poutcome": "nonexistent", "emp.var.rate": 1.1, "cons.price.idx": 93.994, "cons.conf.idx": -36.4, "euribor3m": 4.857, "nr.employed": 5191}}'

Expected yes: 

curl -X POST "" --data '{"data": {"age": 34, "job": "blue-collar", "marital": "married", "education": "", "default": "no", "housing": "yes", "loan": "no", "contact": "cellular", "month": "may", "day_of_week": "tue", "duration": 863, "campaign": 1, "pdays": 3, "previous": 2, "poutcome": "success", "emp.var.rate": -1.8, "cons.price.idx": 92.893, "cons.conf.idx": -46.2, "euribor3m": 1.344, "nr.employed": 5099.1}}'
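
The same calls can be made from Python with the standard library; the invoke URL below is a placeholder for the one issued by API Gateway:

```python
import json
import urllib.request

def build_request(invoke_url, record):
    """Build a POST of a {"data": ...} JSON body for the /predict resource."""
    body = json.dumps({"data": record}).encode("utf-8")
    return urllib.request.Request(
        invoke_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(
    "https://example.execute-api.us-east-1.amazonaws.com/test/predict",
    {"age": 56, "job": "housemaid", "pdays": 999},
)
# label = urllib.request.urlopen(req).read().decode()  # uncomment to call
```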

Review and Monitor the Deployment in DataRobot

Once data is reported from the data queue back to DataRobot, the external model will contain metrics relevant to the model and its predictions. You can select the deployment from the DataRobot UI to view operational service health:

[Screenshot: deployment Service Health tab]

and data drift metrics that compare data in scoring requests to that of the original training set.

[Screenshot: deployment Data Drift tab]


Not only can DataRobot be used to build, host, and monitor its own models—with either its own resources or deployed elsewhere—but, as shown here, it can also be used to monitor completely custom models created and hosted on external architecture. In addition to service health and drift tracking statistics of unprocessed features, models with association IDs and actual results can be used to track model accuracy as well; see Measuring Prediction Accuracy: Uploading Actual Results for more.
