This is a great and careful analysis!
DataRobot has tried to build some safeguards into getting predictions on training data. As you are aware, if you train a model on some data and then ask for predictions on that same data, you will get unrealistically high performance.
As a guardrail, when you make predictions using the Training Data option, DataRobot uses stacked predictions. With stacked predictions, DataRobot builds multiple models on different subsets of the data. The prediction for any row is made using a model that excluded that row from training. In this way, each prediction is effectively an “out-of-sample” prediction.
If you just upload the training data as if it were new data, you bypass that safeguard: the model scores rows it already saw during training, so you end up with misleadingly accurate predictions (a symptom of overfitting).
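To make the difference concrete, here is a minimal sketch in plain Python. It is not DataRobot's implementation; it is a toy illustration of the same idea, assuming a 1-nearest-neighbour "model" and pure-noise data where the honest accuracy is about 50%. All function names (`nearest_label`, `in_sample_predictions`, `out_of_fold_predictions`) are hypothetical and exist only for this example.

```python
import random

def nearest_label(train, x):
    # Toy 1-nearest-neighbour "model": copy the label of the closest training row.
    return min(train, key=lambda row: abs(row[0] - x))[1]

def in_sample_predictions(rows):
    # Train on all rows, then predict those same rows (no safeguard).
    return [nearest_label(rows, x) for x, _ in rows]

def out_of_fold_predictions(rows, k=5):
    # Stacked-style predictions: each row is predicted by a model
    # that was trained only on the folds that do NOT contain that row.
    preds = [None] * len(rows)
    for fold in range(k):
        train = [r for i, r in enumerate(rows) if i % k != fold]
        for i, (x, _) in enumerate(rows):
            if i % k == fold:
                preds[i] = nearest_label(train, x)
    return preds

def accuracy(rows, preds):
    return sum(p == y for (_, y), p in zip(rows, preds)) / len(rows)

rng = random.Random(0)
# Assumption for illustration: labels are random and unrelated to the feature,
# so no model can honestly do much better than chance.
rows = [(rng.random(), rng.randint(0, 1)) for _ in range(200)]

in_sample_acc = accuracy(rows, in_sample_predictions(rows))
oof_acc = accuracy(rows, out_of_fold_predictions(rows))

print("in-sample accuracy:  ", in_sample_acc)
print("out-of-fold accuracy:", oof_acc)
```

The in-sample score comes out perfect because every row finds itself as its own nearest neighbour, even though there is nothing real to learn. The out-of-fold score should land near chance, which is the realistic estimate of how the model would do on new data.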
You can find more details in the documentation, in the section on the Make Predictions tab: https://app2.datarobot.com/docs/predictions/ui/predict.html (this link works for trial users).