I'm starting on DataRobot and more generally on ML.
I'm trying to understand the Lift Chart and I don't get what the bins mean. Are they regular groups of randomly sampled records?
Thank you for your help !
Thank you @IraWatt and @Bogdan Tsal-Tsalko ! I feel more confident about Lift Chart interpretation ! 🙂
Yes, you are correct!
The threshold is not used for the Lift Chart, only the raw prediction of the class being "1".
So the Lift Chart is built after the prediction scores are computed and before the threshold is applied?
Since the prediction for class 1 is used to build the Lift Chart, does the first bin gather the prediction scores farthest from 1 (closest to 0)?
A binary classification prediction is initially a continuous value between 0 and 1; applying the threshold then assigns it to class 0 or 1.
The continuous prediction of the class being "1" is what is used to build the Lift Chart.
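To make the distinction concrete, here is a minimal Python sketch. The scores and the 0.5 threshold are made-up values for illustration, not anything taken from DataRobot; it only shows that thresholding produces hard labels, while the ordering used for the bins comes from the raw scores.

```python
# Hypothetical predicted scores for class "1" on six rows (made-up values).
probabilities = [0.05, 0.92, 0.40, 0.61, 0.18, 0.77]

# Assumed threshold for illustration only; not a DataRobot-specific value.
THRESHOLD = 0.5

# Thresholding turns each continuous score into a hard 0/1 label.
labels = [1 if p >= THRESHOLD else 0 for p in probabilities]

# The Lift Chart works from the raw scores, so rows are ordered by score
# (lowest first), not by the thresholded label.
order = sorted(range(len(probabilities)), key=lambda i: probabilities[i])

print("labels:", labels)
print("row order for binning:", order)
```

Changing THRESHOLD changes the labels but leaves the row order, and therefore the bins, untouched.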
Hi @Bogdan Tsal-Tsalko ! Thank you as well for your explanation.
My difficulty is with the "predictions" and how they are used to assign rows to bins. Are they numerical values?
For a binary classification, are the classes 0 and 1?
Hey @IraWatt ! Thank you for the explanation !
I'm not sure I'm getting it entirely. Which prediction score is used to sort the rows?
The Lift Chart is a great tool for quickly understanding the performance of your model.
Bins are not predefined, and they are not random as you assumed.
Once a model is trained it can make predictions. Take, for example, the predictions on the holdout part of the data. We sort the predictions and split them into equally sized chunks - those are the bins. As a result, with 5 bins, the first bin contains the 20% of rows with the smallest predictions, the second bin the next 20%, and so on up to the 5th bin with the 20% largest predictions.
The second line (the orange one) is the actuals: for each prediction made by the model there is an actual value. Both lines are created by averaging the predictions and the actuals inside each bin.
In the end we have two lines: the predictions, which grow monotonically, and the actuals, which may not be monotonic and which we want to be as close as possible to the predictions.
Fewer bins on the chart mean less noise in the actuals; more bins give a finer representation of the distribution of predictions.
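The steps above can be sketched in a few lines of plain Python. This is only an illustration of the sort, split and average idea, not DataRobot's actual implementation; the function name `lift_chart_bins` and the data are hypothetical.

```python
def lift_chart_bins(predictions, actuals, n_bins):
    """Return (mean_prediction, mean_actual) per bin, lowest predictions first."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    if not 1 <= n_bins <= len(predictions):
        raise ValueError("n_bins must be between 1 and the number of rows")

    # Sort rows by predicted value, ascending.
    rows = sorted(zip(predictions, actuals), key=lambda row: row[0])

    n = len(rows)
    bins = []
    for b in range(n_bins):
        # Integer boundaries keep bin sizes within one row of each other.
        start = b * n // n_bins
        end = (b + 1) * n // n_bins
        chunk = rows[start:end]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        mean_actual = sum(a for _, a in chunk) / len(chunk)
        bins.append((mean_pred, mean_actual))
    return bins


# Made-up holdout data: prediction for class "1" and the true 0/1 label.
predictions = [0.9, 0.1, 0.4, 0.8, 0.2, 0.6, 0.3, 0.7, 0.5, 0.95]
actuals = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1]

for mean_pred, mean_actual in lift_chart_bins(predictions, actuals, n_bins=5):
    print(f"predicted={mean_pred:.3f}  actual={mean_actual:.3f}")
```

With this toy data the averaged predictions rise from bin to bin, while the averaged actuals stay flat at 0.5 across the second and third bins, which is the kind of gap between the two lines the chart is meant to reveal.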
Hope this helps!
Hey @Jean-Robot 👋,
"DataRobot creates the Lift Chart by sorting predictions in increasing order and then grouping them into equal-sized bins." Lift Chart: DataRobot docs
So if, for instance, you were creating a classification model, the Lift Chart for one of your classes could have 10 bins; bin 1 would contain the 10% of rows with the lowest predicted score for that class.
That's my understanding of it from reading Lift Chart: DataRobot docs
All the best,