Hello!
While evaluating the DataRobot platform, I ran into some uncertainty.
After the final model is built, I can easily get the Validation and Holdout performance metric values.
My question concerns performance on the Train partition. As I understand it, Train performance can be obtained in two different ways: (1) by requesting predictions on the Training Data partition in the UI, or (2) by uploading the training dataset again as a new prediction dataset and scoring it with the same model.
The problem is that the metric values and the predicted values differ significantly on the Train(-only) partition of the dataset when I compare these two approaches.
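For concreteness, here is roughly what I mean by the two approaches, sketched with the datarobot Python client (the method names are how I understand the client and may differ slightly by version; PROJECT_ID, MODEL_ID, and training_data.csv are placeholders):

```python
import datarobot as dr

project = dr.Project.get("PROJECT_ID")                            # placeholder project id
model = dr.Model.get(project="PROJECT_ID", model_id="MODEL_ID")   # placeholder model id

# Approach 1: ask DataRobot for predictions on the Training Data partition.
job = model.request_training_predictions(dr.enums.DATA_SUBSET.ALL)
training_preds = job.get_result_when_complete().get_all_as_dataframe()

# Approach 2: re-upload the original training file as a new prediction dataset
# and score it with the same model.
dataset = project.upload_dataset("training_data.csv")
predict_job = model.request_predictions(dataset.id)
uploaded_preds = predict_job.get_result_when_complete()
```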
In my example, I got the following AUC values:
             DataRobot UI   Approach 1   Approach 2
Validation   0.7638         0.7638       0.7638
Holdout      0.7585         0.7585       0.7585
Train        ???            0.7546       0.8313
So this is a great and careful analysis!
DataRobot has tried to build some safeguards into getting predictions on training data. As you are aware, if you train on some data and then ask for predictions on that same data, you will get unrealistically high performance.
As a guardrail in DataRobot, when you request predictions using the Training Data option, we use stacked predictions. With stacked predictions, DataRobot builds multiple models on different subsets of the data, and the prediction for any given row is made by a model that excluded that row from training. In this way, each prediction is effectively an “out-of-sample” prediction.
If you just upload the training data as a new prediction dataset, you get around that safeguard and end up with misleadingly accurate (overfit) predictions.
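To make the difference concrete, here is a minimal sketch of the same effect using scikit-learn rather than DataRobot itself (synthetic data and a hypothetical model choice): out-of-fold (“stacked”) predictions give an honest estimate, while scoring the training rows with a model that has already seen them gives an inflated AUC, much like the gap between 0.7546 and 0.8313 above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
clf = GradientBoostingClassifier(random_state=0)

# "Stacked" / out-of-fold predictions: every row is scored by a model
# that did not see that row during training.
oof = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
print("Out-of-fold AUC:", roc_auc_score(y, oof))

# In-sample predictions: fit on all rows, then score those same rows.
# The model has already seen every row, so the AUC is optimistically inflated.
clf.fit(X, y)
in_sample = clf.predict_proba(X)[:, 1]
print("In-sample AUC:", roc_auc_score(y, in_sample))
```

The first number is the analogue of Approach 1 (the stacked Training Data predictions); the second is the analogue of Approach 2 (re-uploading the training data).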
You can see more details about this in the documentation, in the section on the Make Predictions tab: https://app2.datarobot.com/docs/predictions/ui/predict.html (this link works for trial users).