
Use the REST API to get more meta-data on Prediction jobs

Eu Jin
Data Scientist


In the UI, the Prediction Jobs page lists all batch prediction jobs, whether they were made through the REST API directly, the DataRobot Python API client, or job definitions. Here's a screenshot of the UI I'm referring to:

 

Prediction Jobs UI

 

You can get more details on each of those prediction jobs via the REST API, and I'll show you how in this post.
 
Before we dive into the REST API, we'll need Python to help with the task. So let's import our libraries and establish our DataRobot client:
 

 

 

# Importing the libraries
import datarobot as dr
import pandas as pd
import os
import requests
import getpass

print(os.getcwd())
token = getpass.getpass()  # use your own token
dr.Client(token=token, endpoint="https://app.datarobot.com/api/v2")

Then run this code below to pull the equivalent of the UI I showed earlier.

API_ENDPOINT = "https://app.datarobot.com/api/v2/batchPredictions"

# your API key here 
API_KEY = token
session = requests.Session()
session.headers = {
    'Authorization': 'Bearer {}'.format(API_KEY),
}

# get all the info
resp = session.get(API_ENDPOINT)
print(resp.status_code)
df = pd.json_normalize(resp.json()["data"])
df.head()

 

 

 
This is what we get, which is a lot more than what's provided in the UI:
MLOps-code.png
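The request above reads a single response. If your account has many prediction jobs, the listing may be split across pages. Here is a minimal sketch for collecting every page, assuming the response carries its items under "data" (as in the code above) and a "next" field holding the URL of the following page, or nothing on the last page. The "next" field and the helper name fetch_all_pages are assumptions for illustration, so check them against your own response before relying on this.

```python
def fetch_all_pages(session, url, max_pages=50):
    """Collect the "data" items from every page of a paginated listing.

    Assumes each page is a JSON object with a "data" list and a "next"
    URL that is empty or missing on the last page.
    """
    items = []
    pages = 0
    while url and pages < max_pages:      # max_pages guards against endless loops
        resp = session.get(url)
        resp.raise_for_status()           # stop early on HTTP errors
        payload = resp.json()
        items.extend(payload.get("data", []))
        url = payload.get("next")         # assumed field name
        pages += 1
    return items
```

With the session and API_ENDPOINT defined earlier, the result could be passed straight to pandas: df = pd.json_normalize(fetch_all_pages(session, API_ENDPOINT)).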

 

 

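# Show every field of the second job (row index 1) as a single column
log1 = pd.DataFrame(df.iloc[1])
with pd.option_context('display.max_rows', 1000, 'display.max_columns', 1000):  # more options can be specified also
    display(log1)  # display() is available in Jupyter/IPython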

 

 

 

Unlike the UI, the REST API gives me up to 73 data points about each prediction job.

 

logsdetails.png

 

Looking through the second job in more detail, I can see the following:
  • Status details - the prediction dataset is missing a few columns
  • User name and full name - the job was run by Eu Jin Lok, and his email is listed
  • Source - the job was made through the UI's make-predictions method (i.e. not via job definitions or the Python API client for Batch Predictions)
  • DatasetID and DeploymentID - both are available, so I can track them down further
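Reading one job at a time works for a single failure, but the same fields can be pulled out for every failed job at once. The sketch below works on the raw list of job records (the "data" list from the response) rather than the DataFrame. The key names "status", "statusDetails", "createdBy", "username", "jobSpec" and "deploymentId", and the status values "FAILED" and "ABORTED", are my assumptions about the response layout, so compare them with the fields you actually see in your own output.

```python
FAILED_STATUSES = {"FAILED", "ABORTED"}  # assumed status values


def summarize_failed_jobs(jobs):
    """Return a short summary dict for each job whose status marks a failure."""
    summaries = []
    for job in jobs:
        if job.get("status") not in FAILED_STATUSES:
            continue
        created_by = job.get("createdBy") or {}   # assumed nested object
        job_spec = job.get("jobSpec") or {}       # assumed nested object
        summaries.append({
            "id": job.get("id"),
            "status": job.get("status"),
            "statusDetails": job.get("statusDetails"),
            "username": created_by.get("username"),
            "deploymentId": job_spec.get("deploymentId"),
        })
    return summaries
```

For example, summarize_failed_jobs(resp.json()["data"]) would give one small record per failed job, which is easier to scan than the full set of columns.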

 

So I'm going to inform this person about the failed job and pass along a few handy links so he can resolve the problem quickly:

  1. URL link to the dataset used for prediction
  2. URL link to the deployment 
  3. URL link to the project 
  4. URL link to the dataset used for training 

     

    • The first one is easy. We know the dataset ID (5ebc89d21b7b850de6ab9a36) from the logs above, so we just need to run this code:

 

datasetid = "5ebc89d21b7b850de6ab9a36"
dataset = dr.Dataset.get(datasetid)
print(dataset)
print('https://app.datarobot.com/ai-catalog/'+datasetid)

 

 

Dataset(name='DEMO_LOCATION_AI_Melbourne_House_Prices ', id='5ebc89d21b7b850de6ab9a36')

https://app.datarobot.com/ai-catalog/5ebc89d21b7b850de6ab9a36

 

To check that the URL works, I click the link from the output above, and this is what I see in the UI:

 

predictiondataAICat.png

 

  • Next, the deployment URL is also easy to get, since the metadata also gave us the deployment ID (6290a642f2d99680864daad8). We run this code:

 

 

deploymentid = "6290a642f2d99680864daad8"
deployment = dr.Deployment.get(deploymentid)
print(deployment)
print('https://app.datarobot.com/deployments/'+deploymentid)

 

 

Deployment(DEMO_MEL_House_prices)

https://app.datarobot.com/deployments/6290a642f2d99680864daad8

 

Clicking the link brings us to this:

 

DeploymentsOverview.png

 

  • To get the URL link of the project I need to retrieve the project ID from the deployment first: 

 

# The deployment ID 
deploymentid = "6290a642f2d99680864daad8"

# defining the api-endpoint 
API_ENDPOINT =  "https://app.datarobot.com/api/v2/deployments/"

# your API key here 
API_KEY = token
session = requests.Session()
session.headers = {
    'Authorization': 'Bearer {}'.format(API_KEY),
}

# get all the info
resp = session.get(API_ENDPOINT+deploymentid)
df = pd.json_normalize(resp.json())
df.T.head(15)

 

 

This is the deployment information, and the Project ID is given: 
DeploymentsInfo.png
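Instead of reading the project ID off the table, it can be picked out of the JSON response directly. Below is a small sketch of a helper that walks a dotted path through nested dictionaries. The helper get_nested is hypothetical, and the path "model.projectId" is my assumption about where the deployment response keeps the project ID, so confirm the exact column name in the transposed table first.

```python
def get_nested(record, path, default=None):
    """Follow a dotted path such as "model.projectId" through nested dicts.

    Returns `default` if any step of the path is missing.
    """
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

With the response from the code above, this would read: projectid = get_nested(resp.json(), "model.projectId"). A None result means the path does not match the response and needs adjusting.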

 

 

Now that I have the project ID (62908fa8929e0d7ef66e388e), I can provide the project URL like this:

 

 

projectid = "62908fa8929e0d7ef66e388e"
project = dr.Project.get(projectid)
print(project)
print('https://app.datarobot.com/projects/'+projectid)

 

 

Project(DEMO_Melbourne_house_prices_NoGIS)

https://app.datarobot.com/projects/62908fa8929e0d7ef66e388e

 

 

  • Finally, we'll get the URL for the dataset used for training. Using the project ID above, we pull some metadata to find the training dataset ID:

 

 

 

# The project ID
projectid = "62908fa8929e0d7ef66e388e"

# defining the api-endpoint 
API_ENDPOINT =  "https://app.datarobot.com/api/v2/projects/"

# your API key here 
API_KEY = token
session = requests.Session()
session.headers = {
    'Authorization': 'Bearer {}'.format(API_KEY),
}

# get all the info
resp = session.get(API_ENDPOINT+'?projectId='+projectid)
df = pd.json_normalize(resp.json())
df.T

 

 

I get this below:

 

projectmeta.png

 

So there's a Catalog ID, which tells me the project was first created using a dataset in the AI Catalog. So let's go find it:

 

datasetid = "629086ace265bd23ab9c1de7"
print('https://app.datarobot.com/ai-catalog/'+datasetid)

 

 

https://app.datarobot.com/ai-catalog/629086ace265bd23ab9c1de7

 

Finally, clicking on the link brings me to this: 

TrainingDataset.png
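The Catalog ID used above was copied by hand from the project metadata. A rough sketch of looking it up in code follows. It accepts either a single project record or a list of them, since I'm not certain which shape the query returns, and the key names "id" and "catalogId" are assumptions to be checked against the metadata table. The function name find_catalog_id is hypothetical.

```python
def find_catalog_id(projects, project_id):
    """Return the catalog ID of the project with the given ID, or None.

    `projects` may be one project dict or a list of project dicts.
    """
    if isinstance(projects, dict):
        projects = [projects]             # treat a single record as a list of one
    for project in projects:
        if project.get("id") == project_id:
            return project.get("catalogId")   # assumed key name
    return None
```

With the response from the project query, this would read: datasetid = find_catalog_id(resp.json(), projectid). A result of None would suggest either a different key name or a project that was not created from the AI Catalog.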

 

 

In summary, I've obtained the reason for the failed prediction job (i.e. an incorrect prediction dataset was provided to the deployment) and the URL links to the project, deployment, training dataset and prediction dataset, so they can be checked and compared. All of this was done by calling the REST API itself. The DataRobot Python API client can do most of this too, but the REST API will give you a lot more!
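To pass the four links on in one go, they can be assembled from the IDs found in this post. This is only a convenience sketch: the URL patterns are the ones printed earlier, while the function names build_links and format_note are made up for illustration.

```python
BASE_URL = "https://app.datarobot.com"  # same host as used throughout this post


def build_links(prediction_dataset_id, deployment_id, project_id, training_dataset_id):
    """Return the four UI links for a prediction job, keyed by what they point to."""
    return {
        "prediction dataset": f"{BASE_URL}/ai-catalog/{prediction_dataset_id}",
        "deployment": f"{BASE_URL}/deployments/{deployment_id}",
        "project": f"{BASE_URL}/projects/{project_id}",
        "training dataset": f"{BASE_URL}/ai-catalog/{training_dataset_id}",
    }


def format_note(links):
    """Render the links as plain text, one per line, ready to paste into a message."""
    return "\n".join(f"{label}: {url}" for label, url in links.items())


links = build_links(
    prediction_dataset_id="5ebc89d21b7b850de6ab9a36",
    deployment_id="6290a642f2d99680864daad8",
    project_id="62908fa8929e0d7ef66e388e",
    training_dataset_id="629086ace265bd23ab9c1de7",
)
print(format_note(links))
```

Keeping the URL patterns in one function means that a different host (for example a self-managed installation) only needs BASE_URL changed.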

 
