Finding a dataset ID

Hello. I have a model which I've built using the UI, and I now want to extract information about it via the R API. I can find the project ID easily enough, e.g. with:

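library(httr)        # GET, add_headers, content, verbose
library(jsonlite)    # fromJSON
library(data.table)  # data.table

# List all projects visible to this API token
res <- GET(
  url = "https://app.eu.datarobot.com/api/v2/projects/",
  add_headers("Authorization" = paste("Bearer", token, sep = " ")),
  encode = "json",
  verbose()
)

# Parse the JSON response and look up the project ID by project name
dt_projects <- data.table(fromJSON(content(res, as = "text")))
project_id <- dt_projects[projectName == projName, id]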

 

What I can't see how to do anywhere is extract the dataset id related to this project.  Any ideas?

Thanks, Tom

 

 


Accepted Solutions

A datasetId only applies to datasets in the AI Catalog.

If a project is created by uploading a dataset directly (for example via URL, drag and drop, local file upload, or project creation through the API), it won't have an associated AI Catalog entry, and so won't have an associated datasetId.

 


5 Replies

Hi @Tom B,

I'm not an active user of the R API, but the R package docs list two functions, 'ListDataSources' and 'ListDataStores', one of which may be what you're looking for.

All the best,

Ira 


Thanks @IraWatt - good shout, and this might indeed be the intended solution, but it doesn't work for me: both are R functions that take no arguments, and both just return an empty list.


Using the project_id, you can get details about the project.

For example, for projectId 615442d2c3b2a5d486280268, request:

https://app.datarobot.com/api/v2/projects/615442d2c3b2a5d486280268/

 

The response for that is:

{"id": "615442d2c3b2a5d486280268", "projectName": "10k_diabetes.csv", "fileName": "10k_diabetes.csv", "stage": "modeling", "autopilotMode": 0, "created": "2021-09-29T10:41:41.499215Z", "target": "readmitted", "metric": "LogLoss", "partition": {"cvMethod": "stratified", "validationType": "CV", "holdoutPct": 20.0, "reps": 5, "useTimeSeries": null, "validationLevel": null, "datetimeCol": null, "cvHoldoutLevel": null, "holdoutLevel": null, "trainingLevel": null, "userPartitionCol": null, "validationPct": null, "partitionKeyCols": null}, "recommender": {"isRecommender": null, "recommenderItemId": null, "recommenderUserId": null}, "advancedOptions": {"weights": null, "blueprintThreshold": 3, "responseCap": false, "seed": null, "scaleoutModelingMode": "disabled", "defaultMonotonicIncreasingFeaturelistId": null, "defaultMonotonicDecreasingFeaturelistId": null, "onlyIncludeMonotonicBlueprints": false, "shapOnlyMode": false, "runLeakageRemovedFeatureList": true, "smartDownsampled": false, "majorityDownsamplingRate": null, "downsampledMinorityRows": null, "downsampledMajorityRows": null, "blendBestModels": true, "prepareModelForDeployment": true, "considerBlendersInRecommendation": false, "scoringCodeOnly": false}, "positiveClass": 1.0, "maxTrainPct": 64.0, "maxTrainRows": 6400, "scaleoutMaxTrainPct": 64.0, "scaleoutMaxTrainRows": 6400, "holdoutUnlocked": false, "catalogId": "615326e4a8a592c8ff27ff81", "catalogVersionId": "615326e5a8a592c8ff27ff82", "externalTimeSeriesBaselineDatasetMetadata": null, "segmentation": null, "targetType": "Binary", "unsupervisedMode": false, "useFeatureDiscovery": false, "quickrun": true, "automodelDeploymentId": null}

 

If the project was built from the AI Catalog, the response contains the datasetId in the catalogId field:

"catalogId": "615326e4a8a592c8ff27ff81"
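
A minimal R sketch of this lookup, assuming the httr and jsonlite packages are installed and that token holds a valid API token, as in the original question. The helper names get_project and extract_dataset_id are hypothetical and not part of any DataRobot package; the endpoint and the catalogId field are taken from the response shown above, and the base URL will differ per installation (the question uses app.eu.datarobot.com).

```r
# Hypothetical helper: fetch one project's details from the REST API.
# Assumes httr and jsonlite are installed; adjust base_url for your installation.
get_project <- function(project_id, token,
                        base_url = "https://app.datarobot.com/api/v2") {
  res <- httr::GET(
    url = paste0(base_url, "/projects/", project_id, "/"),
    httr::add_headers(Authorization = paste("Bearer", token))
  )
  httr::stop_for_status(res)  # fail loudly on 4xx/5xx responses
  jsonlite::fromJSON(httr::content(res, as = "text", encoding = "UTF-8"))
}

# Hypothetical helper: return the AI Catalog dataset ID for a parsed project,
# or NA if catalogId is null (project not created from the AI Catalog).
extract_dataset_id <- function(project) {
  catalog_id <- project$catalogId
  if (is.null(catalog_id) || is.na(catalog_id)) NA_character_ else catalog_id
}

# Usage (requires network access and a valid token):
# project <- get_project(project_id, token,
#                        base_url = "https://app.eu.datarobot.com/api/v2")
# dataset_id <- extract_dataset_id(project)
```

Returning NA rather than raising an error lets the caller decide how to handle projects that were created by direct upload.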

 

 

The inverse is also available: if you have the datasetId, you can get its associated projects:

https://app.datarobot.com/api/v2/datasets/615326e4a8a592c8ff27ff81/projects/

 

{"count": 1, "next": null, "previous": null, "data": [{"id": "615442d2c3b2a5d486280268", "url": "https://app.datarobot.com/api/v2/projects/615442d2c3b2a5d486280268/"}], "totalCount": 1}
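
The reverse lookup could be sketched in R along the same lines, again assuming httr and jsonlite are installed. The helper names list_dataset_projects and project_ids_from_response are hypothetical; the endpoint and the data/id fields come from the response shown above. This sketch reads only the first page of results and does not follow the next link.

```r
# Hypothetical helper: pull project IDs out of a parsed
# /datasets/{datasetId}/projects/ response (parsed without simplification).
project_ids_from_response <- function(parsed) {
  if (is.null(parsed$data) || length(parsed$data) == 0) {
    return(character(0))
  }
  vapply(parsed$data, function(p) p$id, character(1))
}

# Hypothetical helper: request the projects associated with a dataset.
# Assumes httr and jsonlite are installed; adjust base_url for your installation.
list_dataset_projects <- function(dataset_id, token,
                                  base_url = "https://app.datarobot.com/api/v2") {
  res <- httr::GET(
    url = paste0(base_url, "/datasets/", dataset_id, "/projects/"),
    httr::add_headers(Authorization = paste("Bearer", token))
  )
  httr::stop_for_status(res)  # fail loudly on 4xx/5xx responses
  parsed <- jsonlite::fromJSON(
    httr::content(res, as = "text", encoding = "UTF-8"),
    simplifyVector = FALSE  # keep "data" as a list of records
  )
  project_ids_from_response(parsed)
}

# Usage (requires network access and a valid token):
# list_dataset_projects("615326e4a8a592c8ff27ff81", token)
```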

 

 

 


Interesting... I built the particular model I'm trying to interrogate using the platform, and my project has catalogId = NULL.  Indeed every single ID listed on the page (https://app.eu.datarobot.com/api/v2/projects/61********31) is null, apart from the project ID.

 

It may be quickest for me to just re-upload the dataset, or a sample of it, via the API. That is practical for this task, but obviously not always ideal!
