Re: Is there a prebuilt mlflow "flavor" for datarobot? - DataRobot Community

jonathan-dufault-kr · ‎11-24-2022

I access DataRobot through Python from Databricks to train, score, and evaluate models.

I was wondering if there's a pre-built DataRobot flavor for mlflow, or if anyone has worked on making an mlflow model for DataRobot?

I'm not seeing anything after a solid day of research. I've been making my own with pyfunc, but this is a hail mary to see if anyone else has also worked on it.

Abdul.J · ‎11-25-2022

Hi,

We have notebooks that use papermill and mlflow to track experiments for specific use cases. Please let us know what exactly you are looking for in using mlflow. Is it metric and artifact tracking?

jonathan-dufault-kr · ‎11-25-2022

Yes, and I'd be interested in whatever you have, at whatever stage it is (reasons at the end). I'll preface this by saying that I'm only getting familiar with mlflow right now; I'm by no means an expert and don't yet know everything it can do. The biggest reason is that mlflow is a core part of the recommended machine learning workflow on Databricks. DataRobot is a (if not the) core part of our machine learning stack, so I'm left trying to reconcile these tools.

This current project uses DataRobot time-aware modeling. I'm working with a business team at the local site, and with analysts on our team. The mlflow features I'm currently looking at are metrics logging, parameter logging, associating both with a project id and model id, and independent trial and error among analysts on the team (and reconciling/housing/evaluating those trials in a central place).
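A minimal sketch of that kind of logging, under stated assumptions: the `metrics` argument is assumed to have the nested shape the DataRobot Python client exposes as `Model.metrics` (metric name, then partition), and the helper names `flatten_metrics` and `log_datarobot_model` as well as the `datarobot.*` tag keys are made up for illustration. One DataRobot model becomes one mlflow run, tagged with its project id and model id.

```python
import re


def flatten_metrics(metrics):
    """Flatten {"RMSE": {"validation": 1.2, ...}} into {"RMSE.validation": 1.2}."""
    flat = {}
    for metric_name, by_partition in metrics.items():
        for partition, value in by_partition.items():
            # Skip partitions that were never scored (None) and list-valued entries
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                continue
            # Conservative sanitising: keep only alphanumerics, "_", "-", ".", " " and "/",
            # which mlflow accepts in metric names
            key = re.sub(r"[^A-Za-z0-9_\-. /]", "_", f"{metric_name}.{partition}")
            flat[key] = float(value)
    return flat


def log_datarobot_model(project_id, model_id, metrics, params=None):
    """Record one DataRobot model as one mlflow run (hypothetical helper)."""
    import mlflow  # imported lazily so flatten_metrics works without mlflow installed

    with mlflow.start_run(run_name=f"datarobot-{model_id}"):
        # Tags tie the mlflow run back to the DataRobot project and model
        mlflow.set_tags({
            "datarobot.project_id": project_id,
            "datarobot.model_id": model_id,
        })
        mlflow.log_params(params or {})
        mlflow.log_metrics(flatten_metrics(metrics))
```

Each analyst can call `log_datarobot_model` against a shared mlflow experiment, which gives one central place to compare runs while the tags keep every run traceable to its DataRobot model.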

Also, without a template for datarobot with mlflow, I'm having to think through and implement what-I-think-is-a-good-workflow (e.g. what parameters to log and where, what diagnostics to save, what metadata should be saved by default, how do I represent the leaderboard, what does making the model available for other users look like, what does 'production' look like, ...)
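One possible answer to the leaderboard question, sketched under assumptions: each model object is assumed to expose `id`, `model_type`, and a nested `metrics` dict (as models returned by the DataRobot client's `project.get_models()` do), and the helper names `leaderboard_rows` and `write_leaderboard` are hypothetical. The leaderboard is flattened into a CSV file that can then be stored as an mlflow artifact.

```python
import csv
from types import SimpleNamespace


def leaderboard_rows(models, metric="RMSE", partition="validation"):
    """Return one dict per model, lowest score first; unscored models go last."""
    rows = []
    for m in models:
        score = m.metrics.get(metric, {}).get(partition)
        rows.append({"model_id": m.id, "model_type": m.model_type, "score": score})
    # Assumes lower is better (true for RMSE); reverse the order for metrics like AUC
    return sorted(
        rows,
        key=lambda r: (r["score"] is None, r["score"] if r["score"] is not None else 0.0),
    )


def write_leaderboard(rows, path):
    """Write the rows to a CSV file and return its path."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["model_id", "model_type", "score"])
        writer.writeheader()
        writer.writerows(rows)
    return path


# Stand-in objects so the sketch runs without a DataRobot connection
demo_models = [
    SimpleNamespace(id="m1", model_type="Elastic-Net", metrics={"RMSE": {"validation": 2.0}}),
    SimpleNamespace(id="m2", model_type="XGBoost", metrics={"RMSE": {"validation": 1.0}}),
]
demo_rows = leaderboard_rows(demo_models)
```

Inside an active run, `mlflow.log_artifact(write_leaderboard(rows, "leaderboard.csv"))` would attach the file to that run. Which metric and partition to rank by is a project-level choice, so both are left as parameters.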

It would be nice if there were a framework that gently reinforced best practices with DataRobot and made them easy to follow.

Abdul.J · ‎11-29-2022

Sure, Jonathan, I'll reach out to you by email.

Doctor Youness · ‎02-26-2023

I'm not aware of any pre-built mlflow "flavor" specifically for DataRobot, but you may be able to integrate DataRobot with mlflow through the pyfunc flavor.

The pyfunc flavor lets you package any Python function that makes predictions on new data into a format that can be deployed with mlflow. You can define a function that loads the DataRobot model and use it as the predict function in the pyfunc model.

Here's an example code snippet that shows how you can define a Pyfunc model that uses a Datarobot model for predictions:

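```python
import mlflow.pyfunc
import datarobot as dr

# Placeholders: replace with your own DataRobot project and model IDs
PROJECT_ID = "YOUR_PROJECT_ID"
MODEL_ID = "YOUR_MODEL_ID"


# Define the function that loads the DataRobot model and makes predictions
def datarobot_predict(data):
    # Assumes the DataRobot client is already configured, e.g. with dr.Client(...)
    # Model.get needs the project as well as the model ID
    project = dr.Project.get(PROJECT_ID)
    model = dr.Model.get(project=PROJECT_ID, model_id=MODEL_ID)

    # Models score an uploaded prediction dataset rather than a local DataFrame
    dataset = project.upload_dataset(data)
    predict_job = model.request_predictions(dataset.id)

    # Block until the job finishes; the result is a pandas DataFrame
    return predict_job.get_result_when_complete()


# Define the pyfunc model
class DatarobotModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        return datarobot_predict(model_input)


# Log the pyfunc model with mlflow
mlflow.pyfunc.log_model(
    "datarobot_model",
    python_model=DatarobotModel(),
)
```

In this example, you would replace "YOUR_PROJECT_ID" and "YOUR_MODEL_ID" with the actual IDs of your DataRobot project and model.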

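After logging the model with mlflow, you can use the standard mlflow deployment tools to deploy the model to different target environments. Because the wrapper calls the DataRobot API at prediction time, each target environment needs network access to DataRobot and valid credentials.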
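Before deploying anywhere, it can help to load the logged model back and score a small sample. A rough sketch, assuming the model was logged under the artifact path "datarobot_model" as in the snippet above; the helper names `logged_model_uri` and `score_with_logged_model` are made up for illustration, and `run_id` is the ID of the mlflow run that logged the model.

```python
def logged_model_uri(run_id, artifact_path="datarobot_model"):
    """Build the runs:/ URI for a model logged under `artifact_path` in run `run_id`."""
    return f"runs:/{run_id}/{artifact_path}"


def score_with_logged_model(run_id, data):
    """Load the logged pyfunc model and score a pandas DataFrame with it."""
    import mlflow.pyfunc  # imported lazily; mlflow is only needed at call time

    model = mlflow.pyfunc.load_model(logged_model_uri(run_id))
    # This ends up in DatarobotModel.predict, so DataRobot credentials must be
    # available wherever the model is loaded
    return model.predict(data)
```

The same URI can be passed to mlflow's model registry or serving tools, so a successful round trip here is a reasonable smoke test for the packaging step.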

I hope this helps! Let me know if you have any more questions.
