DataRobot MLOps (Machine Learning Operations) is a flexible product for deploying, governing, and monitoring ML models. Customers are not limited to serving DataRobot models from the dedicated, scalable prediction servers inside the DataRobot cluster; they can also deploy DataRobot models into their own Kubernetes (K8s) clusters. When they do, they keep the advantages of DataRobot’s model monitoring platform, such as service health and data drift tracking. These exportable DataRobot models are called portable prediction servers (PPSs). A PPS is built much like a Docker container, with the same flexibility and portability.
Unifying the portability of DataRobot model Docker images with the scalability inherent to a K8s platform results in a powerful ML solution ready for production usage.
This tutorial walks you step by step through deploying a DataRobot model on Google Kubernetes Engine (GKE), the managed Kubernetes service of Google Cloud Platform (GCP).
This tutorial uses the Kaggle housing prices dataset (https://www.kaggle.com/c/home-data-for-ml-course/data). Once Autopilot finishes building models, you can create and download the MLOps model package. To do this, navigate to the Models tab, select the model you want, and click Predict > Downloads. In the MLOps Package section, select Generate & Download.
Figure 3. Generate & Download MLOps package
The model package (.mlpkg file) containing all the necessary information about the model is generated.
Now you are ready to create a Docker container image.
Note: First, contact DataRobot support to find out how to access the PPS base image.
Once you have the PPS base image, use the following Dockerfile to generate an image that includes the DataRobot model. The .mlpkg file is copied into the Docker image, so make sure the Dockerfile and the .mlpkg file are in the same folder.
Figure 4. Dockerfile
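Because the Dockerfile is shown only as a figure, the following is a minimal sketch of what such a file might look like, written out from the shell. The base image reference is a placeholder (use whatever DataRobot support provides), `house-regression.mlpkg` is a hypothetical file name, and the target directory `/opt/ml/model` is an assumption that should be checked against the PPS documentation.

```shell
# All three values are assumptions for illustration; adjust them to your setup.
PPS_BASE_IMAGE="<pps-base-image>:<tag>"   # placeholder: image reference obtained from DataRobot support
MLPKG_FILE="house-regression.mlpkg"       # hypothetical name of the downloaded model package
MODEL_DIR="/opt/ml/model"                 # assumed location where the PPS looks for the model

# Write a two-line Dockerfile next to the .mlpkg file.
cat > Dockerfile <<EOF
FROM ${PPS_BASE_IMAGE}
COPY ${MLPKG_FILE} ${MODEL_DIR}/
EOF
```

Run this in the folder that holds the .mlpkg file, then inspect the generated Dockerfile before building.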
The generated image will contain the DataRobot model and the monitoring agent used to transfer the metrics about service and model health back to the DataRobot MLOps platform.
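Building the image could be sketched as follows. The project ID `my-gcp-project` is hypothetical; the image name and the `v1` tag mirror the ones used in the appendix configuration.

```shell
PROJECT_ID="my-gcp-project"                              # hypothetical GCP project ID
IMAGE="gcr.io/${PROJECT_ID}/house-regression-model:v1"   # name and tag as in the appendix

build_image() {
  # Run from the folder that contains the Dockerfile and the .mlpkg file.
  docker build -t "${IMAGE}" .
}
```

Calling `build_image` produces a local image that is already tagged for Container Registry.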
Testing the image locally is often considered optional, but we advise always doing it: it saves time and network bandwidth, since container images can be on the order of tens of gigabytes.
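A local smoke test might look like the sketch below. Port 8080 matches the `targetPort` in the appendix Service configuration. The `/predictions` path and the CSV content type are assumptions about the PPS API, and `my-gcp-project` is a hypothetical project ID; verify both against the PPS documentation for your image.

```shell
IMAGE="gcr.io/my-gcp-project/house-regression-model:v1"   # hypothetical project ID
LOCAL_PORT=8080

run_local() {
  # Start the container in the background and map the PPS port to localhost.
  docker run --rm -d -p "${LOCAL_PORT}:8080" "${IMAGE}"
}

score_local() {
  # $1: path to a CSV file with rows to score.
  # Assumption: the PPS accepts CSV posted to /predictions.
  curl -X POST "http://localhost:${LOCAL_PORT}/predictions" \
    -H "Content-Type: text/csv" \
    --data-binary "@$1"
}
```

For example, call `run_local`, wait for the server to start, and then call `score_local scoring.csv` with a few rows of the housing dataset.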
You need to upload the container image to a registry so that your Google Kubernetes Engine (GKE) cluster can download and run it.
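Assuming the image was tagged for Container Registry as in the appendix, the upload could be sketched like this. The project ID is hypothetical.

```shell
IMAGE="gcr.io/my-gcp-project/house-regression-model:v1"   # hypothetical project ID

push_image() {
  # Register gcloud as a Docker credential helper so pushes to gcr.io are authenticated.
  gcloud auth configure-docker --quiet
  # Upload the locally built image to Container Registry.
  docker push "${IMAGE}"
}
```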
Note: When pushing to Container Registry, you may get a permission error that mentions 'storage.buckets.create'. If you do, reach out to the administrator of your GCP account.
Now that the Docker image is stored in Container Registry, you need to create a GKE cluster.
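Creating the cluster from the command line could be sketched as follows. The cluster name, zone, and node count are hypothetical values chosen for illustration; size the cluster for your own workload.

```shell
CLUSTER_NAME="house-regression-cluster"   # hypothetical cluster name
ZONE="us-central1-a"                      # hypothetical zone
NUM_NODES=3                               # hypothetical node count

create_cluster() {
  # Create the GKE cluster.
  gcloud container clusters create "${CLUSTER_NAME}" \
    --zone "${ZONE}" --num-nodes "${NUM_NODES}"
  # Make sure kubectl is pointed at the new cluster.
  gcloud container clusters get-credentials "${CLUSTER_NAME}" --zone "${ZONE}"
}
```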
Note: When creating the GKE cluster, you may get the following error:
ERROR: (gcloud.container.clusters.create) ResponseError: code=400, message=The user does not have access to service account "default". Ask a project owner to grant you the iam.serviceAccountUser role on the service account.
If encountered, please reach out to the administrator of your GCP account.
In GKE, the default service type is ClusterIP, which gives the service an IP address reachable only from inside the cluster. To expose a Kubernetes service outside the cluster, create a service of type LoadBalancer. This type of service provisions an external load balancer IP for a set of pods, reachable over the internet.
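One way to create the deployment and expose it through a LoadBalancer service is sketched below. The names, replica count, and ports are taken from the appendix configuration; the project ID is hypothetical.

```shell
APP="house-regression-app"
IMAGE="gcr.io/my-gcp-project/house-regression-model:v1"   # hypothetical project ID

deploy_and_expose() {
  # Create the deployment from the pushed image and scale it to five replicas.
  kubectl create deployment "${APP}" --image="${IMAGE}"
  kubectl scale deployment "${APP}" --replicas=5
  # Expose the pods to the internet: service port 80 forwards to container port 8080.
  kubectl expose deployment "${APP}" --name="${APP}-service" \
    --type=LoadBalancer --port=80 --target-port=8080
  # The EXTERNAL-IP column shows the load balancer address once it is provisioned.
  kubectl get service "${APP}-service"
}
```

The external IP can take a short while to appear; until then `kubectl get service` reports it as pending.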
Next, add the PPS and monitoring agent settings to the K8s Deployment configuration as the following environment variables (see the appendix for the complete configuration file):
PORTABLE_PREDICTION_API_WORKERS_NUMBER=2
PORTABLE_PREDICTION_API_MONITORING_ACTIVE=True
PORTABLE_PREDICTION_API_MONITORING_SETTINGS=output_type=output_dir;path=/tmp;max_files=50;file_max_size=10240000;model_id=<your mlops_model_id_obtained_at_step_8>;deployment_id=<your mlops_deployment_id_obtained_at_step_8>
MONITORING_AGENT=True
MONITORING_AGENT_DATAROBOT_APP_URL=https://app.datarobot.com/
MONITORING_AGENT_DATAROBOT_APP_TOKEN=<your token>
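Instead of editing the configuration by hand, the variables listed above could be applied with `kubectl set env`, as in this sketch. The function names are hypothetical helpers, and the deployment name comes from the appendix. The monitoring settings value is quoted because it contains semicolons.

```shell
monitoring_settings() {
  # $1: MLOps model ID, $2: MLOps deployment ID
  printf 'output_type=output_dir;path=/tmp;max_files=50;file_max_size=10240000;model_id=%s;deployment_id=%s' "$1" "$2"
}

configure_monitoring() {
  # $1: MLOps model ID, $2: MLOps deployment ID, $3: DataRobot API token
  kubectl set env deployment/house-regression-app \
    PORTABLE_PREDICTION_API_WORKERS_NUMBER=2 \
    PORTABLE_PREDICTION_API_MONITORING_ACTIVE=True \
    "PORTABLE_PREDICTION_API_MONITORING_SETTINGS=$(monitoring_settings "$1" "$2")" \
    MONITORING_AGENT=True \
    MONITORING_AGENT_DATAROBOT_APP_URL=https://app.datarobot.com/ \
    "MONITORING_AGENT_DATAROBOT_APP_TOKEN=$3"
}
```

Changing the environment of a deployment triggers a rolling restart of its pods, so the new settings take effect without a manual redeploy.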
Upgrading the deployed Docker image is a straightforward process.
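A rolling upgrade could be sketched as follows, assuming a new image tag (for example `v2`) has already been built and pushed. The deployment and container names come from the appendix; the project ID is hypothetical.

```shell
APP="house-regression-app"
CONTAINER="house-regression-model"
PROJECT_ID="my-gcp-project"   # hypothetical GCP project ID

upgrade_image() {
  # $1: tag of the new image, e.g. v2
  kubectl set image "deployment/${APP}" \
    "${CONTAINER}=gcr.io/${PROJECT_ID}/house-regression-model:$1"
  # Wait until the rolling update has finished.
  kubectl rollout status "deployment/${APP}"
}
```

With the RollingUpdate strategy shown in the appendix, pods are replaced gradually, so the service stays available during the upgrade.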
This tutorial explained how to deploy and monitor DataRobot models on GCP via a Portable Prediction Server (PPS). A PPS is based on Docker containers and contains a DataRobot model with embedded monitoring agents. With this approach, a DataRobot model is served from a scalable deployment environment, and its associated data can be tracked in the centralized DataRobot MLOps dashboard with all of its monitoring and governance advantages.
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  creationTimestamp: "2020-07-08T12:47:27Z"
  generation: 8
  labels:
    app: house-regression-app
  name: house-regression-app
  namespace: default
  resourceVersion: "14171"
  selfLink: /apis/apps/v1/namespaces/default/deployments/house-regression-app
  uid: 2de869fc-c119-11ea-8156-42010a840053
spec:
  progressDeadlineSeconds: 600
  replicas: 5
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: house-regression-app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: house-regression-app
    spec:
      containers:
      - env:
        - name: PORTABLE_PREDICTION_API_WORKERS_NUMBER
          value: "2"
        - name: PORTABLE_PREDICTION_API_MONITORING_ACTIVE
          value: "True"
        - name: PORTABLE_PREDICTION_API_MONITORING_SETTINGS
          value: output_type=output_dir;path=/tmp;max_files=50;file_max_size=10240000;model_id=<your_mlops_model_id>;deployment_id=<your_mlops_deployment_id>
        - name: MONITORING_AGENT
          value: "True"
        - name: MONITORING_AGENT_DATAROBOT_APP_URL
          value: https://app.datarobot.com/
        - name: MONITORING_AGENT_DATAROBOT_APP_TOKEN
          value: <your_datarobot_api_token>
        image: gcr.io/ai-XXXXXX-XXXX/house-regression-model:v1
        imagePullPolicy: IfNotPresent
        name: house-regression-model
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 5
  conditions:
  - lastTransitionTime: "2020-07-08T12:47:27Z"
    lastUpdateTime: "2020-07-08T13:40:47Z"
    message: ReplicaSet "house-regression-app-855b44f748" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2020-07-08T13:41:39Z"
    lastUpdateTime: "2020-07-08T13:41:39Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 8
  readyReplicas: 5
  replicas: 5
  updatedReplicas: 5
---
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2020-07-08T12:58:13Z"
  labels:
    app: house-regression-app
  name: house-regression-app-service
  namespace: default
  resourceVersion: "5055"
  selfLink: /api/v1/namespaces/default/services/house-regression-app-service
  uid: aeb836cd-c11a-11ea-8156-42010a840053
spec:
  clusterIP: 10.31.242.132
  externalTrafficPolicy: Cluster
  ports:
  - nodePort: 30654
    port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: house-regression-app
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: XX.XX.XXX.XXX