Data Drift Showing Inaccurate Results

Hello colleagues

Recently I trained a classification model on the platform, and I am observing a difference in data distribution between the training and scoring data.

Please see the screenshot.

Screen Shot 2021-10-04 at 3.39.03 PM.png 

But when I check my actual data for the COUNTRY variable, I do not see any new values. In the screenshot, the far right end shows new values in the scoring set.

But here are the values from the actual data.

Training set: PH, US, Other, GB, PK, IN, CN

Scoring set: US, PH, Other, GB, PK, IN, CN

I cannot see any new levels/values for the COUNTRY variable in the scoring set. So why is DataRobot reporting that new levels are showing up, and hence drift?
Is this a bug? Kindly advise.
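
For anyone who wants to repeat the level comparison described above, here is a minimal Python sketch. The helper names (`normalize`, `compare_levels`) are hypothetical, and the normalization step rests on an assumption: that stray whitespace or letter case could make two values look identical to the eye while counting as different levels. It does not reproduce how DataRobot computes drift.

```python
def normalize(value):
    """Strip surrounding whitespace and upper-case a raw category value."""
    return str(value).strip().upper()


def compare_levels(train_values, scoring_values):
    """Report scoring levels that are absent from the training levels.

    'new_raw' compares the values exactly as stored.
    'new_after_normalizing' compares them after normalize(), so a level
    listed only under 'new_raw' differs just by whitespace or case.
    """
    train_raw = set(train_values)
    scoring_raw = set(scoring_values)
    train_norm = {normalize(v) for v in train_raw}
    scoring_norm = {normalize(v) for v in scoring_raw}
    return {
        "new_raw": sorted(scoring_raw - train_raw, key=str),
        "new_after_normalizing": sorted(scoring_norm - train_norm, key=str),
    }


# COUNTRY levels as listed in the post above.
train = ["PH", "US", "Other", "GB", "PK", "IN", "CN"]
scoring = ["US", "PH", "Other", "GB", "PK", "IN", "CN"]

report = compare_levels(train, scoring)
print(report)
```

With the levels as typed in this post, both lists come back empty. Running the same check on the raw column values (rather than a typed-out list) would show whether any hidden differences exist.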
Hi @Jayant - To create a ticket, you can simply send an email to with the information you shared here. If you’d prefer, I can create the ticket - just let me know.



It sounds then like sampling is not the explanation. At this point I would recommend filing a support ticket so that our team can do a more detailed investigation.

Appreciate your response. My final model (which is being used for predictions) is trained on the entire training dataset, consisting of 80K rows. Please see the screenshot. I am not sure the size of the training data is the cause here, because the model is exposed to the entire training data. Let me know if more information is needed. It is crucial for me to know why this drift is occurring.

Screen Shot 2021-10-07 at 9.39.00 AM.png 

How large was your training dataset? I know that the training baseline only consists of a sample of the training data, but it's fairly large (about 500 MB, I think), so if your training dataset is small, the baseline would encompass the entire thing.
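
To make the sampling idea above concrete, here is a small Python sketch of the chance that a rare level is missing from a sampled baseline. It assumes uniform random sampling without replacement, which is only an illustration and not a description of DataRobot's actual sampling procedure. The function name and the row counts in the example (other than the 80K total mentioned in this thread) are hypothetical.

```python
def prob_level_missing(total_rows, level_rows, sample_rows):
    """Probability that none of the `level_rows` rows carrying a given level
    land in a uniform random sample of `sample_rows` rows drawn without
    replacement from `total_rows` rows.

    Equals C(total - level, sample) / C(total, sample), computed as a
    running product to avoid very large intermediate integers.
    """
    if sample_rows > total_rows - level_rows:
        # The sample is too large to avoid every row with this level.
        return 0.0
    prob = 1.0
    for i in range(level_rows):
        prob *= (total_rows - sample_rows - i) / (total_rows - i)
    return prob


# Hypothetical case: a level that appears in 5 of 80,000 rows.
half_sample = prob_level_missing(80_000, 5, 40_000)
full_sample = prob_level_missing(80_000, 5, 80_000)
print(f"baseline = half of the data: {half_sample:.4f}")
print(f"baseline = all of the data:  {full_sample:.4f}")
```

When the baseline covers the entire training set, the probability is exactly zero, which is why the size of the dataset relative to the sample matters for this explanation.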

Hi @Jayant - Sorry you haven't received help yet with this question. I've elevated it to the DataRobot team and someone should answer momentarily.