Solved: Data Drift Showing Inaccurate results - DataRobot Community

Jayant · ‎10-04-2021

Hello colleagues

I recently trained a classification model on the platform and am observing a difference in data distribution between the training and scoring data.

Please see the screenshot.

But when I check my actual data for the COUNTRY variable, I do not see any new values. In the screenshot, the far right end shows NEW VALUES in the scoring set.

But here are the results from the actual data.

Training set values: PH, US, Other, GB, PK, IN, CN

Scoring set values: US, PH, Other, GB, PK, IN, CN

I cannot see any new levels/values for the COUNTRY variable in the scoring set. So why is DataRobot saying that new levels are showing up, and hence a drift?

Is this a bug? Kindly advise.
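For anyone wanting to reproduce this kind of check outside the platform, here is a minimal sketch in plain Python. It is not how DataRobot computes drift; `new_levels` is a hypothetical helper, and the `raw_scoring` values are invented purely to show that an exact string comparison treats trailing spaces or different casing as separate levels, which is one thing worth ruling out.

```python
def new_levels(training_values, scoring_values):
    """Return levels that appear in scoring but never in training, sorted."""
    return sorted(set(scoring_values) - set(training_values))


# COUNTRY values as listed in the post.
training_country = ["PH", "US", "Other", "GB", "PK", "IN", "CN"]
scoring_country = ["US", "PH", "Other", "GB", "PK", "IN", "CN"]

# Same set of levels in both, so nothing is reported as new.
print(new_levels(training_country, scoring_country))  # prints []

# Hypothetical raw values (not from the post): a trailing space or a
# different case makes a value count as a distinct level.
raw_scoring = ["US ", "us", "PH"]
print(new_levels(training_country, raw_scoring))  # prints ['US ', 'us']
```

If the first call returns an empty list on the real columns, the raw data itself contains no unseen levels, and the question becomes what the drift chart is comparing against.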

Linda · ‎10-06-2021

Hi @Jayant - Sorry you haven't received help yet with this question. I've elevated it to the DataRobot team and someone should answer momentarily.

jmbledsoe · ‎10-06-2021

How large was your training dataset? I know that the training baseline consists of only a sample of the training data, but that sample is fairly large (about 500 MB, I think), so if your training dataset is small, the baseline would encompass the entire thing.
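To illustrate why sampling could matter in general, here is a rough sketch in plain Python. It assumes uniform random sampling and uses synthetic data; it does not reflect how the platform actually builds its baseline, and `levels_missing_from_sample` is a hypothetical helper. The point is only that a rare level can be absent from a sample, in which case it would look new when it later appears in scoring data.

```python
import random


def levels_missing_from_sample(full_values, sample_size, seed=0):
    """Return levels present in the full data but absent from a random sample."""
    rng = random.Random(seed)
    # Cap the sample size so a "sample" of a small dataset is the whole dataset.
    sample = rng.sample(full_values, min(sample_size, len(full_values)))
    return sorted(set(full_values) - set(sample))


# Synthetic training column: one dominant level and two rare ones.
training = ["US"] * 990 + ["PH"] * 9 + ["CN"]

# A 10% sample may or may not contain the rare levels, depending on the draw.
print(levels_missing_from_sample(training, 100))

# When the sample covers the whole dataset, no level can be missing.
print(levels_missing_from_sample(training, len(training)))  # prints []
```

When the baseline covers the entire training set, as in the second call, sampling cannot be the reason a level is reported as new.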

Jayant · ‎10-07-2021

I appreciate your response. My final model (the one being used for predictions) is trained on the entire training data, consisting of 80K rows. Please see the screenshot. I am not sure the size of the training data is the cause here, because the model is exposed to the entire training data. Let me know if more information is needed. It is crucial for me to know why this drift is occurring.

jmbledsoe · ‎10-07-2021

It sounds then like sampling is not the explanation. At this point I would recommend filing a support ticket so that our team can do a more detailed investigation.

Linda · ‎10-07-2021

Hi @Jayant - To create a ticket, you can simply send an email to support@datarobot.com with the information you shared here. If you’d prefer, I can create the ticket - just let me know.

Thanks

Linda
