Machine Learning Operations (MLOps) Walkthrough


With DataRobot Machine Learning Operations (MLOps), you have a central hub to deploy, monitor, manage, and govern machine learning models in production to maximize your investments in data science teams and to manage risk and regulatory compliance.

Use Case

In this article we're going to present a simple step-by-step guide to using DataRobot MLOps through a typical lifecycle. We start with a quick tour of the main pages where you will be spending most of your time as you utilize MLOps to monitor and manage your deployed models. Then, we present the steps for getting your models and data into MLOps.

We begin by uploading a model into MLOps as a Model Package into the Model Registry. Then we’ll see how to create a deployment from your model package. Next we show how to monitor incoming data for changes across all the model feature variables over time using data drift, along with assessing the model performance by comparing predictions made to actual outcomes. This includes a step to upload the actual results so that accuracy can be tracked. From there we show how to replace a model when its performance degrades. And finally, we discuss leveraging a process control framework for your model development and implementation workflows using MLOps Governance.

Model Deployment, Monitoring, Management and Governance

Step 1: The Deployment Dashboard and Deployment Details Pages

The Deployments dashboard is the first page you land on when you access the MLOps user interface. It presents an inventory of all of your deployments.

Figure 1. The Deployment Dashboard

By deployment we mean a model that you have deployed and that is available for scoring or inference requests. The deployment is a separate entity from the model: because you access the model through the deployment, you can replace the model with a newer version without disrupting the way you request predictions. This also allows MLOps to monitor each underlying model version separately and keep track of the historical lineage of models for the deployment.

On the Deployments dashboard, across the top of the inventory, a summary of the usage and status of all active deployments is displayed, with color-coded health indicators.

Figure 2. Dashboard Summary

Beneath the summary is an individual report for each deployment. Next to the name of the deployment is the relative status of each deployment across three core monitoring dimensions: Service Health, Data Drift, and Accuracy. In addition, the columns displayed can be switched to show the deployments with information relevant to the governance perspective, vs the prediction health information. These views are referred to as “lenses.”

Figure 3. Prediction Health lens

Figure 4. Governance lens

A few metrics on prediction activity traffic are shown as well as a menu of options available to manage the model.

To view all of this information in detail, select the deployment you want to view; you will land on the Overview page, the first of several deployment details pages that provide features for monitoring and managing the deployment.

Figure 5. Deployment Overview page

The deployment Overview page provides a model-specific summary that describes the deployment, including the information you supplied when creating the deployment and any model replacement activity.

  • Summary lists the user-supplied name and description entered when the deployment was added.
  • Content provides deployment-specific details, including the target specified, and model information that varies depending on type of deployment, such as the dataset used to create the model if available or information about a custom model.
  • Governance provides organizations a way to implement an approval process for model management. This includes the create and deploy dates, along with a log of when a model is replaced.

The Service Health tab tracks metrics about a deployment’s ability to respond to prediction requests quickly and reliably. This helps identify any bottlenecks affecting speed and response time. It also helps you assess throughput and capacity, which is critical to proper resource provisioning in order to support good performance and latency levels.

Figure 6. Deployment Service Health page

Next is the Data Drift page. By leveraging the training data (aka “learning data”) and prediction scoring data (aka “inference data”) that are added to your deployment, MLOps can assess data drift, which is a measure of how the incoming prediction data differs from the data used to train the model.

Figure 7. Deployment Data Drift page

The Accuracy page shows you how accurate the model's predictions are. The actual outcome of what your model predicted may be immediately apparent, or it may take days, weeks, or even months to determine. In any case, once you have those actual results and upload them, MLOps will associate them with the predictions made and present the calculated accuracy for review and analysis.

Figure 8. Deployment Accuracy page

Under Integrations, you’ll see a Python code sample containing the lines needed to make an API call to DataRobot to score new data. In many cases, you can simply copy and paste this into your software program, and in a matter of minutes you’re integrated and up and running with DataRobot and our API.

Figure 9. Deployment Integrations page
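To give a feel for what such a scoring call involves, here is a minimal sketch that only builds the HTTP request, without sending it. It is not the snippet from the Integrations tab: the URL layout, the header names, and all values (host, deployment ID, token, key) are assumptions for illustration, so copy the real ones from your own deployment.

```python
import json
import urllib.request


def build_scoring_request(host, deployment_id, api_token, datarobot_key, rows):
    """Build (but do not send) a POST request that scores `rows`.

    The URL layout and header names below are assumptions for illustration;
    use the values shown on your deployment's Integrations tab.
    """
    url = f"https://{host}/predApi/v1.0/deployments/{deployment_id}/predictions"
    headers = {
        "Content-Type": "application/json; charset=UTF-8",
        "Authorization": f"Bearer {api_token}",
    }
    if datarobot_key:
        # Assumed to be required only on some installations.
        headers["DataRobot-Key"] = datarobot_key
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values; none of these identify a real deployment.
rows = [{"feature_a": 1.5, "feature_b": "blue"}]
request = build_scoring_request(
    host="example.invalid",
    deployment_id="DEPLOYMENT_ID",
    api_token="API_TOKEN",
    datarobot_key="DATAROBOT_KEY",
    rows=rows,
)

# To actually score the rows you would send the request, for example:
#     with urllib.request.urlopen(request) as response:
#         predictions = json.loads(response.read())
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and payload before any data leaves your environment.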

And lastly, Settings provides an interface to upload and configure datasets associated with the deployment and underlying model. Namely, this allows you to add data to a deployment, set up notification settings to monitor deployments with alerts, and enable prediction warnings.

Figure 10. Deployment Settings page

Step 2: Understanding Model Packages and the Model Registry

Creating a deployment begins with creating a model package and uploading it into the Model Registry.

The Model Registry is the central hub for all your model packages. A package contains a file, a set of files, and/or information about your model; this varies depending on the type of model being deployed. DataRobot MLOps is flexible enough to work with:

  • Models built within DataRobot AutoML or DataRobot AutoTS.
  • Your own models, built outside of DataRobot and uploaded into MLOps.
  • Models built and operated outside of DataRobot that you monitor with MLOps.

In all three cases, you create a model package, and once the package is in the Model Registry, from there you can create a deployment.

Figure 11. The Model Registry

The Model Registry provides you with a consistent deployment, replacement, and management experience, regardless of the type of model you have deployed. If the model is built in DataRobot AutoML or AutoTS, the model package can be automatically added to the Model Registry package list when the deployment is created from the Leaderboard; otherwise, packages are added to the package list manually.

Creating a model package is a simple process. The procedures that follow walk through creating a model package and deployment for each of the three model types.

A key difference is that DataRobot models and custom models have prediction requests received and processed within MLOps through an API call, while external models handle predictions in an outside environment and then those predictions are transferred back to MLOps. An MLOps Agent—the software you install that communicates from your environment back to the MLOps environment—tracks the predictions transferred to MLOps. For this reason the code sample displayed in the Integrations page is different for the Agent software vs a DataRobot or custom model. However, in all three cases, you source a deployment from a Model Package, utilize MLOps to monitor the data drift and predictions the model makes, and manage the model just the same.

Creating a Model Package for a DataRobot Model

Figure 12. Example model package for a model built with DataRobot AutoML

For a model built within DataRobot, navigate to the Leaderboard and click on the model you want to deploy. Then select Predict > Deploy. You have three deployment options available to you:

  • Create a package to upload to MLOps. Model packages provide portability for your models, allowing you to upload to a separate MLOps environment.
  • Deploy within the AutoML environment. This deploys your model in a single click to the local MLOps attached to the AutoML or AutoTS instance, where it is immediately available to receive prediction requests.
  • Generate a package that automatically gets posted in the Model Registry, and from there you manually deploy it.

In all three cases, a model package is created in the Model Registry.

Figure 13. Creating a deployment from the Leaderboard

If you use the first option and create a model package, then after you save the model package file to your file system, you upload it into the destination MLOps environment from the Model Registry.

Figure 14. Add new model package from file

Creating a Model Package for a Custom Model

Figure 15. Example model package for a custom model

MLOps allows you to bring your own pre-trained models into DataRobot and the MLOps environment. These models are called custom inference models; inference here means the model is implemented to service prediction requests. By uploading a custom inference model, you can specify the execution environment and which library versions are required to run and test it for readiness to accept prediction requests. Once it passes the test, you can either deploy it or add the package to the Model Registry (from where you can make any further edits and then deploy it when ready). DataRobot supports custom models built with a variety of coding languages, including Python, Scala, and Java.

Using custom models is beyond the scope of this document. DataRobot licensed customers can find more information in the in-app Platform Documentation, within the "Custom Model Workshop" section.

Figure 16. Custom Model Workshop

Creating a Model Package for an External Model

The MLOps agent allows you to monitor and manage external models, i.e., those running outside of DataRobot MLOps. With this functionality, predictions and information from these models can be reported as part of DataRobot MLOps deployments. You can use the same model management tools to monitor accuracy, data drift, prediction distribution, latency, etc., regardless of where the model is running.

To create a model package for an external model that is monitored by the MLOps agent, navigate to Model Registry > Model Packages. Click Add New Package and select New external model package.

Figure 17. Add external model package

In the resulting dialog box, complete the fields pertaining to the MLOps agent-monitored model from which you are retrieving statistics. (The agent software must be installed in your environment to act as a bridge between your model and the MLOps external model deployment. Complete information for setting up the agent is provided in other articles and in the documentation included with the MLOps agent tarball. If needed, search the in-app Platform Documentation for the Integrations tab to find information about the MLOps agent tarball.)

Figure 18. Add new model package fields

Step 4: Creating a Deployment

Once the model package is in the Model Registry, you simply navigate to the menu at the far right of any model package and select Deploy.

Figure 19. Deploy from a model package on the Model Registry

On the following page you’ll enter the remaining information needed to track predictions and model accuracy.

Figure 20. New deployment page

The information you see in the Model section (such as name and target) has already been supplied from the contents of the model package file.

Likewise, we see in the Learning section that the data used to train the model is also already known; DataRobot stored the information from when it created the AutoML or AutoTS project.

The Inference section contains information about capturing predictions and we can see that it is only partially complete. DataRobot stores the incoming prediction data received via an API call at the URL endpoint provided. If your DataRobot instance is hosted on the Managed AI Cloud, the subdomain name will be derived from your account, and if you have an on-premise installation, your endpoint will be hosted at your domain.

Capturing the predictions allows DataRobot to assess how your incoming prediction data differs from your training data. To capture those differences, click Enable data drift tracking. Selecting the option to perform segment analysis allows DataRobot to identify characteristics of the incoming data, such as the permission level of the user making the requests or the IP address where the request came from.

Figure 21. Enabling Data Drift

If you also want to track prediction accuracy, you need to be able to associate the predictions the model makes with the actual results. Commonly, the actual outcome isn’t known until days, weeks, or months later. We refer to these actual results simply as the “actuals.” To associate the predictions with the actuals, you need an identifier. The Association ID uniquely identifies each prediction and appears in an extra column that is appended to the rows of the request data. When you upload the actuals dataset, you supply the Association ID and the actual value of what happened; this ties them together.

Figure 22. Association ID
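The role of the Association ID can be illustrated with a small sketch. This is not how MLOps implements the matching internally; it only shows the idea of joining predictions to late-arriving actuals on a shared key. The IDs, values, and function names are hypothetical, and mean absolute error is used here simply as one example metric.

```python
def match_actuals(predictions, actuals):
    """Pair each prediction with its actual via the shared association ID.

    Both arguments map association ID -> value. Predictions whose actuals
    have not arrived yet are skipped.
    """
    return {
        assoc_id: (predicted, actuals[assoc_id])
        for assoc_id, predicted in predictions.items()
        if assoc_id in actuals
    }


def mean_absolute_error(pairs):
    """Average absolute gap between predicted and actual values."""
    return sum(abs(pred - actual) for pred, actual in pairs.values()) / len(pairs)


# Hypothetical data: three predictions were made, two actuals are known so far.
predictions = {"order-001": 120.0, "order-002": 80.0, "order-003": 95.0}
actuals = {"order-001": 130.0, "order-003": 90.0}

matched = match_actuals(predictions, actuals)
mae = mean_absolute_error(matched)  # (10 + 5) / 2 = 7.5
```

Because "order-002" has no actual yet, it is left out of the accuracy figure until its outcome is uploaded.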

That brings us to the last section, Actuals. After the deployment is created and you acquire the actuals, you can click the Add Data link to upload them, following the few additional steps described in Step 5: Uploading Actuals.

Figure 23. Actuals add data

All that’s left to do now is to give your deployment a name, click Create deployment, and indicate the level of importance for the deployment; this creates the new deployment. The deployed model is now ready to receive prediction requests, and MLOps will start tracking the predictions. To find out more about the model Importance settings, have a look at the MLOps Governance capabilities (Step 9: Governance).

Figure 24. Create deployment

Figure 25. Importance

As suggested above, there are some shortcuts for creating a deployment, depending on the type of model. For a DataRobot model, you can deploy it directly from the Leaderboard. For a custom model, you can deploy from the Custom Model Workshop.

Figure 26. Deploy a DataRobot model from the Leaderboard

Figure 27. Deploy a custom model from the Custom Model Workshop

Step 5: Uploading Actuals

You have a deployment and are making predictions, but now you want to see how well your model is performing. To do this, you need to upload the actual outcome data and associate it with the predictions that were made.

To track the model accuracy we need to first import the actuals data into the AI Catalog. The AI Catalog is your own dedicated storage resource and provides a centralized way to manage data sets from different data sources. We won't go into the many features it has, except to say that you will upload and store your actuals data here. To do so, select AI Catalog and click Add to Catalog. Then, select the source of your data to upload it, which in this case is a local file.

Figure 28. AI Catalog

Navigate back to the Deployments dashboard and select your deployment. Now, to return to the previous page where we created the deployment, click the Settings menu item, and we see the Actuals section is now enabled.

Click Add Data to locate the actuals data from the AI Catalog. From here you specify the following: the Actuals Response column (which holds your actual outcome results), the Association ID column to link back to the predictions made, an optional column name to keep a record of what action was taken given the result, and an optional column name with a timestamp if you want to keep track of when the actual values were obtained.

Figure 29. Actuals data entry
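As a rough sketch of what an actuals file with those four columns might look like, the snippet below writes a small CSV. The column names are hypothetical (you tell MLOps which column plays which role when you add the data), and the records are made up for illustration.

```python
import csv
import io

# Hypothetical column names; you map your own names to these roles in the UI.
ACTUALS_COLUMNS = ["association_id", "actual_value", "was_acted_on", "timestamp"]


def actuals_to_csv(records):
    """Serialize actuals records (dicts keyed by ACTUALS_COLUMNS) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=ACTUALS_COLUMNS)
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Made-up outcomes for two earlier predictions.
records = [
    {"association_id": "order-001", "actual_value": 130.0,
     "was_acted_on": True, "timestamp": "2020-05-01T09:30:00Z"},
    {"association_id": "order-003", "actual_value": 90.0,
     "was_acted_on": False, "timestamp": "2020-05-02T14:00:00Z"},
]

csv_text = actuals_to_csv(records)
# Save csv_text to a local file, then add that file to the AI Catalog.
```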

Click Upload when you’re finished specifying this information. Click the Accuracy tab and you’ll see how the predictions perform in comparison to the actual outcomes.

Step 6: Monitoring Performance with Service Health, Data Drift, and Accuracy

Service Health tracks metrics about a deployment’s ability to respond to prediction requests quickly and reliably. This helps identify any bottlenecks affecting speed and response time. It also helps you assess throughput and capacity, which is critical to proper resource provisioning in order to support good performance and latency levels.

Figure 30. Service Health
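To make the kinds of metrics behind a service health view more concrete, here is a minimal sketch that summarizes a made-up request log. The log format, the field names, and the rule that HTTP status 400 or above counts as an error are all assumptions for illustration; MLOps computes its own metrics for you.

```python
import statistics


def summarize_service_health(requests, window_seconds):
    """Summarize throughput, latency, and error rate for a window of requests.

    Each request is a dict with "latency_ms" and "status" (HTTP status code).
    Treating status >= 400 as an error is an assumption of this sketch.
    """
    latencies = [r["latency_ms"] for r in requests]
    errors = sum(1 for r in requests if r["status"] >= 400)
    return {
        "total_requests": len(requests),
        "requests_per_minute": len(requests) * 60 / window_seconds,
        "median_latency_ms": statistics.median(latencies),
        "error_rate": errors / len(requests),
    }


# Hypothetical log covering a two-minute window.
request_log = [
    {"latency_ms": 40, "status": 200},
    {"latency_ms": 60, "status": 200},
    {"latency_ms": 50, "status": 200},
    {"latency_ms": 250, "status": 500},
]

summary = summarize_service_health(request_log, window_seconds=120)
```

A slow, failing request like the last one raises the error rate immediately but barely moves the median latency, which is why it helps to track several metrics side by side.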

In the majority of cases, your models will degrade over time. The composition or type of data may change, or the way you collect and store it may change.

On the Accuracy page, we see the difference between the predictions made and the actual values. In the example shown in the image below, the model is fairly consistently under-predicting the actuals. Most likely, the degraded predictions are a result of a change in the composition of the data.

Figure 31. Accuracy

The Data Drift page shows you how the prediction data changes over time from the data you originally used to train the model. In the plot on the left, each green, yellow, or red dot represents a feature. The degree of feature importance is shown on the X-axis, and a calculation of the severity of data drift is on the Y-axis. In the plot on the right, we see the range of values of each selected feature, with the original training data in dark blue and more recent prediction data in light blue. Looking at a few examples, we can see how the composition of the data has changed.

Figure 32. Data Drift
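For intuition about what a drift severity score can look like, the sketch below computes the Population Stability Index (PSI), one commonly used drift measure, for a single numeric feature. This article does not state which calculation the Data Drift page uses, so treat the metric, the equal-width binning, and the sample data as assumptions chosen for illustration.

```python
import math


def population_stability_index(training, scoring, bins=10):
    """PSI between two numeric samples, binned over the training range."""
    lo, hi = min(training), max(training)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if training is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    p = proportions(training)
    q = proportions(scoring)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical feature values: scoring data shifted upward relative to training.
training_values = list(range(100))
scoring_values = [v + 30 for v in training_values]

no_drift = population_stability_index(training_values, training_values)
drift = population_stability_index(training_values, scoring_values)
```

Identical distributions give a PSI of zero, and the score grows as the scoring data moves away from the training data, which is the behavior the Y-axis of a drift plot is meant to capture.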

So inevitably you’ll want to retrain your model on the latest data and replace the model currently in the deployment with the new model. DataRobot MLOps makes this easy by providing a simple interface to swap out your model, all the while maintaining the lineage of models and all collected drift and accuracy data. And this occurs seamlessly, without any service disruption.

Step 7: Replacing a Model

To replace a model, open the actions menu on the far right of the Deployments List page and select Replace model. This option is only available if you are a deployment owner. You can also select Replace model from the same menu on the Deployments dashboard, on the row for your deployment.

Then, simply point DataRobot to the model you want to use by uploading another model package file or referencing one in the Model Registry. DataRobot will do a check that the data types match and then prompt you to indicate a reason for the change, such as degradation seen in data drift. Then, just click Accept and replace to submit the change.

Figure 33. Accept and replace model

With the governance workflow enabled, reviewers will be notified that the pending change is ready for their review, and the update will occur once it has been approved. If you do not have the governance workflow enabled for model replacement, the update to the deployment is immediate.

Figure 34. Governance

Now you’ll see the new model located in the History column on the deployment Overview page. Navigating through the Service Health, Data Drift, and Accuracy pages, you’ll find the same dropdown menu allowing you to select a version of the model you want to explore.

Figure 35. Versions

Step 8: Set up Notifications

All machine learning models tend to degrade over time. DataRobot monitors your deployment in real time, and you can always check on it yourself to review the model health. To further assist you, DataRobot provides automated monitoring with a notification system. You can configure notifications to alert you when the service health, incoming prediction data, or model accuracy exceeds your defined acceptable levels.

To configure the conditions for notifications, navigate to the Deployment Settings menu, and click Notifications.

You have three options for controlling notification delivery via email:

  • Receive all event notifications, including critical, at risk, and scheduled delivery emails.
  • Receive only the critical event emails.
  • Disable notifications for the deployment.

Figure 36. Types of notifications settings

The Monitor tab is where you set exactly what values trigger the notifications. Users that have the role of “Owner” will be able to modify these settings; however, any user with whom the deployment has been shared can configure the level for the notifications that they want to receive, as shown on the Notifications tab. A user that isn’t an owner of the deployment can still view the same settings information.

Monitoring is available for Service Health, Data Drift, and Accuracy. The checkbox enables notification delivery at regularly scheduled intervals, ranging from as often as hourly (for service health) to as seldom as once a quarter, which is available for all three performance monitors.

Figure 37. Monitoring settings for deployment notifications
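The idea of values that trigger notifications can be sketched as a simple threshold check. The status names echo the "at risk" and "critical" notification types mentioned above, but the function, the two-threshold scheme, and the numbers are hypothetical and are not the actual MLOps monitoring rules.

```python
def classify_metric(value, at_risk_threshold, critical_threshold):
    """Map a monitored value to a status, where higher values are worse.

    The two-threshold scheme is an assumption made for this sketch.
    """
    if value >= critical_threshold:
        return "critical"
    if value >= at_risk_threshold:
        return "at risk"
    return "healthy"


def should_notify(status, preference):
    """Decide whether to email, given a user's preference: "all", "critical", or "none"."""
    if preference == "none" or status == "healthy":
        return False
    if preference == "critical":
        return status == "critical"
    return True


# Hypothetical drift scores checked against made-up thresholds.
statuses = [classify_metric(score, 0.1, 0.25) for score in (0.05, 0.15, 0.40)]
```

Separating the owner-defined thresholds from each user's delivery preference mirrors the split between the Monitor and Notifications tabs described above.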

Step 9: Governance

MLOps governance provides your organization with a rights management framework for your model development workflow and process. Certain users are designated to review and approve events related to your deployments. The types of controllable events include creating or deleting deployments, and replacing the underlying model in a deployment.

Figure 38. Deployment that needs approval (governance applied)

With governance approval workflow enabled, before you deploy a model you’re prompted to assign an importance level to it: Critical, High, Moderate, or Low. The importance level helps you prioritize your deployments and the way you manage them. How you specify importance for a deployment is going to be based on the factors that drive the business value for where and how you’re applying the model. Typically this reflects a collection of these factors, such as the amount of prediction volume, the potential financial impact, or any regulatory exposure.

Figure 39. Importance levels for deployments

Once the deployment is created, reviewers are alerted via email that it requires review. Reviewers are users who are assigned the role of an MLOps deployment administrator; approving deployments is one of their primary functions. While awaiting review, the deployment will be flagged as “NEEDS APPROVAL” in the Deployments dashboard (Deployments List). When reviewers access a deployment that needs approval, they will see a notification and be prompted to begin the review process.

Figure 40. Deployment "Needs Approval"


DataRobot’s MLOps platform provides you with one place to manage all your production models, regardless of where they are created or deployed. You can now deliver the value of AI by simplifying the deployment and management of models from multiple machine learning platforms in production. This allows you to proactively manage production models to prevent production issues, ensuring both model trust and performance. This includes live model health monitoring with real-time dashboards, automated monitoring alerts on data deviations, and key model metrics.

When your model is found to have degraded, MLOps model replacement makes your models “hot-swappable” to streamline the model update process without interrupting existing business processes. And with Governance applied, you can safely scale AI projects and maintain control over production models to minimize risk and comply with regulations.

Last updated 05-29-2020 05:59 PM (revision 22 of 22).