Working with Custom Models


(Article updated October 2020)

This article presents a simple step-by-step guide to using your own custom, pre-trained model with DataRobot MLOps.

Overview

Figure 1. Deployments and custom models

The MLOps suite of tools for monitoring model performance and managing the model lifecycle can be used with the custom, pre-trained models that you build in your own development environment. As when managing a model built with DataRobot, you add your custom model to the Model Registry by creating a model package. A model package for a custom model has two unique pieces: in addition to providing information such as the target, the type of machine learning problem, and the data used to train the model, you also provide the serialized model file (such as a Python pickle file of your model) and information about the execution environment and the libraries needed to run the model.
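For example, a scikit-learn model could be serialized with pickle before it is uploaded. The following is only a minimal sketch; the dataset, algorithm, and file name are placeholders for whatever your own model uses:

    import pickle
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier

    # Train a model in your own development environment
    X, y = load_breast_cancer(return_X_y=True)
    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    # Serialize it; this pickle file is the model artifact you upload to the Workshop
    with open("model.pkl", "wb") as f:
        pickle.dump(model, f)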

Add a New Model

Figure 2. Models in Custom Model Workshop

Begin by navigating to Model Registry > Custom Model Workshop. The Models page lists all the custom models you’ve created. Click Add New Model to create a new custom model package, and then supply some information that describes it: give the model a name and indicate the target feature and the target type as either binary classification or regression. Additionally, there are optional fields to provide the programming language used to build the model, and a model description. When you’ve completed all of the desired fields, click Add Custom Model to add it to the list of models available in the Workshop.  

Figure 3. Test and Deploy Model

Next, you need to add files to the new custom model package to tell MLOps how to process prediction results. In the left pane, you upload individual files or a folder of files; exactly what you upload depends on the language of your model and how you want it run. You can upload local files, or you can retrieve the files remotely from an Amazon S3 bucket or a GitHub repository.

Figure 4. Select remote repository

At a minimum, you need just one file: the serialized model. You can also include a file with code for additional hooks that DataRobot uses, for example, to load the model, run preprocessing steps, or apply transformations. This file is custom.py for Python models or custom.R for R models. You can also include a requirements.txt file to specify which libraries are required to run your model.
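As an illustration, a minimal custom.py for a pickled scikit-learn model might implement a hook to load the artifact and an optional preprocessing hook. Treat the hook names and signatures below as a sketch and confirm them against the Platform Documentation for your release:

    # custom.py -- optional hooks for a Python custom model (illustrative only)
    import os
    import pickle

    def load_model(code_dir):
        # Called once at startup; returns the deserialized model object
        with open(os.path.join(code_dir, "model.pkl"), "rb") as f:
            return pickle.load(f)

    def transform(data, model):
        # Optional hook: apply the same preprocessing used at training time
        return data.fillna(0)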

Figure 5. Add files to custom model package

Then, in the right pane, select an environment to use with the custom model. In the dropdown menu, you can select from one of the two types of environments available: a pre-built "drop-in" environment or your own custom environment.

Figure 6. Select model environment

A drop-in environment uses a preconfigured set of common libraries made to work with specific types of model algorithms. For example, for Python there are drop-in environments for scikit-learn, XGBoost, and PyTorch, and there are also drop-in environments for R and Java. These environments are maintained and provided by MLOps in the Custom Model Workshop and cover the most common frameworks you are likely to use with a custom model. However, if you need additional libraries or specific library versions, you can add a requirements.txt file in the left pane to incorporate them, as shown below. The listed libraries and versions appear in the right pane; select Build Environment to build the new environment on top of the drop-in.
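For instance, a requirements.txt that pins extra libraries on top of the scikit-learn drop-in environment might look like the following; the package names and versions are illustrative only and should match what your model actually needs:

    # requirements.txt (illustrative versions only)
    scikit-learn==0.23.2
    pandas==1.1.3
    category_encoders==2.2.2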

Figure 7. Drop-in environment

Add a New Environment

By providing an environment separately from a custom model, MLOps can build the environment as a distinct entity for you. This lets you reuse an environment, with its requirements already defined, for any of your models that need it.

Figure 8. Custom model environment

To create a new custom environment, navigate to the Environments menu and click Add New Environment. You give the new environment a name, optionally provide description text, and then upload a ZIP or TAR archive that contains the environment files.  The archive file contains:

  • A requirements file that lists the specific libraries and library versions to load into the environment
  • A shell script that runs the model with a lightweight web server to receive prediction requests
  • A Dockerfile that installs the libraries, builds a Docker image, and executes the environment inside a container
Figure 9. Example files for custom environment

Note: You can find more specific information about customizing and configuring an environment on the MLOps GitHub repository and in the in-app Platform Documentation by searching Using environments with custom models.

When all fields are complete, click Add; the custom environment is now ready for use. Over time, you may want to add a new version of the environment, for example, to use newer versions of libraries. You can also see all the active deployments operating under the environment and view the environment's metadata.

Figure 10. Environment Information (metadata)

Test the New Model

Figure 11. Testing a drop-in environment

Returning to the custom model, let's use the Python scikit-learn drop-in environment. With the model paired with an environment, we're ready to test it with a sample prediction dataset.

The test detects whether the model runs into any errors when making predictions; you want to make sure the model handles predictions successfully before you deploy it. Below, you can see a trail of the recent tests you ran; all previous tests are listed under the Test tab, along with a log file of any errors encountered.

Now let’s click Test Model, provide a test dataset of rows to make predictions for, and click Start Test.

Figure 12. Test the model

A convenient alternative to using this test engine in the user interface is the DRUM tool (DataRobot User Model Runner). DRUM lets you test your custom model locally in your development environment, returning test results almost immediately so you can iterate quickly. The tool is available for Python models with a simple pip install command.

Figure 13. DataRobot DRUM (https://pypi.org/project/datarobot-drum/)
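A rough sketch of that local workflow follows. The directory and file names are placeholders, and the available flags vary by DRUM version, so check drum score --help for the exact options in your installation:

    # Install the DataRobot User Model Runner
    pip install datarobot-drum

    # Run a quick local test against the model files in ./my_custom_model
    drum score --code-dir ./my_custom_model --input test_data.csv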

Adding new versions of a model or environment

If you want to update a model for any reason, such as the availability of new package versions, different preprocessing steps, or different hyperparameters, you can update the file contents to create a new version of the model, similar to updating an environment with a new version.  

Figure 14. Model version

To do so, select the model from the Workshop to edit it and navigate to the Assemble tab. In the Model section, you can delete any existing files you may have in the window, or select Add Files and upload the new files or folders that you want to include.

When you update the individual contents of a model, a new minor version is created (1.1, 1.2, etc.). You can create a new major version (1.0, 2.0, etc.) by selecting New version and then choosing either Copy contents of previous version (to carry the existing files into the new version) or Create empty version (and then adding the new files to use for the model).

Figure 15. Creating new model versions

You can see a list of all model versions under the Versions tab.

Figure 16. Managing model versions

Assigning learning data to a custom inference model 

If you want to add learning data to the custom model (which allows you to deploy it), you can do so by selecting a custom model and navigating to the Model Info tab which lists attributes about a custom model.

Click Add Learning Data and a pop-up window appears, prompting you to upload the learning data used to train the model. 

Figure 17. Information for custom model

When you’ve added the learning data, MLOps is able to determine how new incoming predictions differ from, or drift apart from, the original training data. Optionally, you can specify a column name containing the partitioning information for your data (based on training/validation/holdout partitions). When the upload is complete, click Add Learning Data. The other information presented is the data you provided when the custom model was first created.

Figure 18. Add Learning Data for training model

Deploying the custom model

With your custom model now tested successfully, we're ready to deploy it. This can be done simply by clicking the Deploy link in the middle of the screen. Alternatively, you can click View Registry Package if you just want to review the model package for the custom model without deploying it yet, for example, when you have a governance process in place for deployment review and approval.

Figure 19. Deploying custom model

By clicking Deploy, we’re taken to the deployment information page, where some information for the custom model is automatically provided from when it was created.  The items on this page are described fully in our other content on the Deployment Details, but in summary, from here you complete the rest of the information needed to deploy the model:

  • In the Model section, almost all of the data shown was provided when the custom model was created. You can also add functional validation data to the deployment, which is used to validate that the model works with the environment, similar to the test performed in the Custom Model Workshop.
  • In the Inference section, you can choose to track data drift and indicate an association ID used to track model accuracy by pairing predictions with the actual results. DataRobot recommends uploading the learning data to your deployment, if it's available, so that you can take advantage of data drift tracking. In this case, it was already provided when we uploaded it a few moments ago under the Model Info tab.
Figure 20. Configuring information for deploying custom model

When you’ve added all the available data and your model is fully defined, your deployment is ready to be created.  Give the deployment a name at the top of the screen and click Create deployment. Note that by creating the deployment, a model package is created and will appear under the Model Packages tab in the Model Registry.

Figure 21. Deployment overview page

You can view all the deployments created from this package at any time by clicking Current Deployments from the Custom Model Workshop in the Model Registry, or by navigating to Model Packages, finding your custom model package, and then clicking Deployments.

With the custom model package now in the Model Registry, if it hasn’t already been deployed—or to deploy it again as a new deployment—simply click to deploy from the menu on the far right, as can be done for any type of model.  

Figure 22. Select Deploy to deploy custom model

Once deployed, you’re ready to make predictions via the API, and begin to monitor and manage the deployment with the full suite of MLOps capabilities.
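For example, a deployed custom model can be scored from Python with a plain HTTP request. The URL, keys, and headers below are placeholders only; copy the exact integration snippet shown for your own deployment in DataRobot:

    import requests

    # Placeholder values; use the snippet provided for your own deployment
    API_KEY = "YOUR_API_KEY"
    DATAROBOT_KEY = "YOUR_DATAROBOT_KEY"
    DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"
    URL = f"https://example.datarobot.com/predApi/v1.0/deployments/{DEPLOYMENT_ID}/predictions"

    # Send a CSV of rows to score and print the JSON response
    with open("scoring_data.csv", "rb") as f:
        response = requests.post(
            URL,
            data=f,
            headers={
                "Content-Type": "text/plain; charset=UTF-8",
                "Authorization": f"Bearer {API_KEY}",
                "DataRobot-Key": DATAROBOT_KEY,
            },
        )
    print(response.json())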

More information

If you’re a licensed DataRobot customer, search the in-app Platform Documentation for Creating custom inference models and Using environments with custom models.
