Finding a dataset ID

Tom B
Blue LED

Hello. I have a model which I've built using the UI, and I now want to extract information about it via the R API. I can find the project ID easily enough, e.g.:

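library(httr)
library(jsonlite)
library(data.table)

# List all projects visible to this API token
res <- GET(
  url = "https://app.eu.datarobot.com/api/v2/projects/",
  add_headers("Authorization" = paste("Bearer", token, sep = " ")),
  encode = "json",
  verbose()  # print request/response details for debugging
)

# Parse the JSON body and look up the project ID by project name
dt_projects <- data.table(fromJSON(content(res, as = "text")))
project_id <- dt_projects[projectName == projName, id]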

 

What I can't find anywhere is how to extract the dataset ID related to this project. Any ideas?

Thanks, Tom

5 Replies
IraWatt
Linear Actuator

Hi @Tom B,

I'm not an active user of the R API, but the R Package Docs list two functions, 'ListDataSources' and 'ListDataStores', one of which may be what you're looking for.

All the best,

Ira 

Tom B
Blue LED

Thanks @IraWatt - good shout, and this might indeed be the intended solution, but for me it is not: both are R functions with no arguments, and both just return an empty list.

RonBuckley
DataRobot Employee

Using the project_id you can get details about the project. 

 

For example, for projectId 615442d2c3b2a5d486280268:

https://app.datarobot.com/api/v2/projects/615442d2c3b2a5d486280268/

 

The response for that is:

{"id": "615442d2c3b2a5d486280268", "projectName": "10k_diabetes.csv", "fileName": "10k_diabetes.csv", "stage": "modeling", "autopilotMode": 0, "created": "2021-09-29T10:41:41.499215Z", "target": "readmitted", "metric": "LogLoss", "partition": {"cvMethod": "stratified", "validationType": "CV", "holdoutPct": 20.0, "reps": 5, "useTimeSeries": null, "validationLevel": null, "datetimeCol": null, "cvHoldoutLevel": null, "holdoutLevel": null, "trainingLevel": null, "userPartitionCol": null, "validationPct": null, "partitionKeyCols": null}, "recommender": {"isRecommender": null, "recommenderItemId": null, "recommenderUserId": null}, "advancedOptions": {"weights": null, "blueprintThreshold": 3, "responseCap": false, "seed": null, "scaleoutModelingMode": "disabled", "defaultMonotonicIncreasingFeaturelistId": null, "defaultMonotonicDecreasingFeaturelistId": null, "onlyIncludeMonotonicBlueprints": false, "shapOnlyMode": false, "runLeakageRemovedFeatureList": true, "smartDownsampled": false, "majorityDownsamplingRate": null, "downsampledMinorityRows": null, "downsampledMajorityRows": null, "blendBestModels": true, "prepareModelForDeployment": true, "considerBlendersInRecommendation": false, "scoringCodeOnly": false}, "positiveClass": 1.0, "maxTrainPct": 64.0, "maxTrainRows": 6400, "scaleoutMaxTrainPct": 64.0, "scaleoutMaxTrainRows": 6400, "holdoutUnlocked": false, "catalogId": "615326e4a8a592c8ff27ff81", "catalogVersionId": "615326e5a8a592c8ff27ff82", "externalTimeSeriesBaselineDatasetMetadata": null, "segmentation": null, "targetType": "Binary", "unsupervisedMode": false, "useFeatureDiscovery": false, "quickrun": true, "automodelDeploymentId": null}

 

If the project was built from the AI Catalog, the response contains the datasetId in the catalogId field:

"catalogId": "615326e4a8a592c8ff27ff81"

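A minimal R sketch of that lookup, using the same httr and jsonlite packages as the original post. The helper names `get_project_details` and `catalog_id_of`, and the `base_url` and `token` arguments, are illustrative assumptions rather than part of any DataRobot package. The second helper returns NA when catalogId is null, which is the case for projects not created from the AI Catalog.

```r
# Sketch only: helper names and arguments are hypothetical.
# base_url would be e.g. "https://app.datarobot.com/api/v2"

# Fetch the details of a single project as a parsed list
get_project_details <- function(base_url, project_id, token) {
  res <- httr::GET(
    url = paste0(base_url, "/projects/", project_id, "/"),
    httr::add_headers(Authorization = paste("Bearer", token))
  )
  httr::stop_for_status(res)  # fail loudly on 4xx/5xx responses
  jsonlite::fromJSON(httr::content(res, as = "text", encoding = "UTF-8"))
}

# Return the catalogId as a string, or NA_character_ if it is null or absent
catalog_id_of <- function(details) {
  id <- details[["catalogId"]]
  if (is.null(id)) NA_character_ else as.character(id)
}
```

Usage would look like `catalog_id_of(get_project_details(base_url, project_id, token))`; an NA result means there is no dataset ID to find for that project.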
 

 

The inverse is also available: if you have the datasetId, you can get its associated projects:

https://app.datarobot.com/api/v2/datasets/615326e4a8a592c8ff27ff81/projects/

 

{"count": 1, "next": null, "previous": null, "data": [{"id": "615442d2c3b2a5d486280268", "url": "https://app.datarobot.com/api/v2/projects/615442d2c3b2a5d486280268/"}], "totalCount": 1}
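
The same inverse lookup could be sketched in R as follows. Again, the helper names `get_dataset_projects` and `project_ids_from`, and the `base_url` and `token` arguments, are hypothetical. The sketch reads only the first page of results and does not follow the "next" link.

```r
# Sketch only: helper names and arguments are hypothetical.

# Fetch the list of projects associated with an AI Catalog dataset
get_dataset_projects <- function(base_url, dataset_id, token) {
  res <- httr::GET(
    url = paste0(base_url, "/datasets/", dataset_id, "/projects/"),
    httr::add_headers(Authorization = paste("Bearer", token))
  )
  httr::stop_for_status(res)  # fail loudly on 4xx/5xx responses
  jsonlite::fromJSON(httr::content(res, as = "text", encoding = "UTF-8"))
}

# Pull the project IDs out of the parsed listing.
# An empty "data" array yields an empty character vector.
project_ids_from <- function(listing) {
  ids <- listing[["data"]][["id"]]
  if (is.null(ids)) character(0) else as.character(ids)
}
```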

Tom B
Blue LED

Interesting... I built the particular model I'm trying to interrogate using the platform, and my project has catalogId = NULL.  Indeed every single ID listed on the page (https://app.eu.datarobot.com/api/v2/projects/61********31) is null, apart from the project ID.

 

It may be quickest for me to just re-upload the dataset, or a sample of it, via the API - which is practical for this task but obviously not always ideal!

RonBuckley
DataRobot Employee

A datasetId only applies to datasets in the AI Catalog.

 

If a project is created by uploading a dataset directly (via URL, drag and drop, local file upload, API project creation, ...), it won't have an associated AI Catalog entry, and so won't have an associated datasetId.