Solved: Re: Lift Chart : what are the bins ? - DataRobot Community

Jean-Robot · ‎11-24-2022

Hi community,

I'm just getting started with DataRobot, and with ML more generally.

I'm trying to understand the Lift Chart and I don't get what the bins mean. Are they regular groups of randomly sampled records?

Thank you for your help !

IraWatt · ‎11-24-2022

Hey @Jean-Robot 👋,

"DataRobot creates the Lift Chart by sorting predictions in increasing order and then grouping them into equal-sized bins." (Source: Lift Chart, DataRobot docs)

So if, for instance, you were building a classification model, the Lift Chart for one of your classes could have 10 bins. Bin 1 would then contain the 10% of rows with the lowest predicted probability of belonging to that class.

That's my understanding from reading the Lift Chart page in the DataRobot docs.

All the best,

Ira

Bogdan Tsal-Tsalko · ‎11-24-2022

Hi, Jean!

The Lift Chart is a great tool for quickly understanding how your model performs.

The bins are neither predefined nor random.

Once a model is trained, it can make predictions. Take, for example, the predictions on the holdout partition. We sort those predictions and split them into equal-sized chunks: those are the bins. So with 5 bins, the first bin holds the 20% of rows with the smallest predictions, the second bin the next 20%, and so on up to the fifth bin, which holds the 20% with the largest predictions.

The second line (the orange one) shows the actuals: every row the model predicts also has an actual value. Each point on either line is the average of the predictions, or of the actuals, within one bin.

In the end we have two lines: the predictions, which increase monotonically from bin to bin, and the actuals, which may not be monotonic. The closer the actuals line is to the predictions line, the better.
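
To make the steps above concrete, here is a minimal sketch in plain Python. It is not DataRobot's code: the function name `lift_chart_bins`, the ten sample rows, and the way uneven row counts are split across bins are all assumptions made for illustration.

```python
from statistics import mean

def lift_chart_bins(predictions, actuals, n_bins=5):
    """Return one (mean prediction, mean actual) pair per bin.

    Hypothetical helper, not a DataRobot API. Assumes there are at
    least n_bins rows, so that no bin is empty.
    """
    # Sort rows by prediction, lowest first.
    rows = sorted(zip(predictions, actuals), key=lambda row: row[0])
    n = len(rows)
    bins = []
    for i in range(n_bins):
        # Integer boundaries keep the bins as equal in size as possible.
        chunk = rows[i * n // n_bins:(i + 1) * n // n_bins]
        bins.append((mean(p for p, _ in chunk), mean(a for _, a in chunk)))
    return bins

# Ten made-up holdout rows: predicted probability of class 1 and true label.
preds = [0.9, 0.1, 0.4, 0.8, 0.2, 0.6, 0.3, 0.7, 0.05, 0.95]
labels = [1, 0, 0, 1, 0, 1, 0, 1, 0, 1]

for number, (avg_pred, avg_actual) in enumerate(lift_chart_bins(preds, labels), start=1):
    print(f"bin {number}: mean prediction {avg_pred:.3f}, mean actual {avg_actual:.3f}")
```

With 10 rows and 5 bins, each bin holds 2 rows. The mean predictions rise from bin to bin because the rows were sorted first, while the mean actuals are simply whatever the labels in each bin average to.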

Fewer bins on the chart mean less noise in the actuals; more bins give a finer picture of how the predictions are distributed.
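
The effect of the bin count on noise can be illustrated with a small simulation. This is only a sketch on synthetic data: it assumes a hypothetical, perfectly calibrated model (each label is drawn with exactly the predicted probability), and `mean_abs_gap` is an illustrative helper, not anything from DataRobot.

```python
import random
from statistics import mean

def mean_abs_gap(predictions, actuals, n_bins):
    """Average |mean prediction - mean actual| over the bins (hypothetical helper)."""
    rows = sorted(zip(predictions, actuals), key=lambda row: row[0])
    n = len(rows)
    gaps = []
    for i in range(n_bins):
        chunk = rows[i * n // n_bins:(i + 1) * n // n_bins]
        gaps.append(abs(mean(p for p, _ in chunk) - mean(a for _, a in chunk)))
    return mean(gaps)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: 1000 predicted probabilities, and labels drawn so that
# each row is 1 with exactly its predicted probability.
preds = [rng.random() for _ in range(1000)]
labels = [1 if rng.random() < p else 0 for p in preds]

gap_few = mean_abs_gap(preds, labels, n_bins=5)     # 200 rows per bin
gap_many = mean_abs_gap(preds, labels, n_bins=100)  # 10 rows per bin
print(f"5 bins: {gap_few:.3f}   100 bins: {gap_many:.3f}")
```

Even though the simulated model is calibrated, the actuals wander further from the predictions when each bin averages only 10 rows than when it averages 200, which is the noise described above.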

Hope this helps!

Jean-Robot · ‎11-24-2022

Hey @IraWatt ! Thank you for the explanation !

I'm not sure I fully get it. Which prediction score is used to sort the rows?

Jean-Robot · ‎11-24-2022

For a binary classification, are the classes 0 and 1?

Jean-Robot · ‎11-24-2022

Hi @Bogdan Tsal-Tsalko ! Thank you as well for your explanation.

What I'm struggling with is the "predictions" and how they are used to assign rows to bins. Is a prediction a numerical value?

Bogdan Tsal-Tsalko · ‎11-24-2022

A binary classification prediction starts out as a continuous value between 0 and 1; applying a threshold then assigns it to 0 or 1.
The predicted probability of the class being "1" is what is used to build the Lift Chart.

Jean-Robot · ‎11-24-2022

So the Lift Chart is built after the prediction scores are computed and before the threshold is applied?

Since the prediction for class 1 is used to build the Lift Chart, does the first bin gather the prediction scores farthest from 1 (closest to 0)?

Bogdan Tsal-Tsalko · ‎11-24-2022

Yes, you are correct!

The threshold is not used for the Lift Chart, only the raw predictions of the class being "1".
Worth mentioning:

You can choose which class is the positive one.
For threshold selection, I suggest using the ROC Curve tab under the Evaluate section.
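
A small sketch of the difference between raw predictions and thresholded labels, in plain Python. The probabilities are made up, `apply_threshold` is a hypothetical helper, and treating a score equal to the threshold as class 1 is an assumption, not a statement about how DataRobot handles ties.

```python
def apply_threshold(probabilities, threshold=0.5):
    """Turn raw class-1 probabilities into 0/1 labels (hypothetical helper)."""
    return [1 if p >= threshold else 0 for p in probabilities]

# Made-up raw predictions: probability that each row belongs to class 1.
probs = [0.62, 0.08, 0.91, 0.35, 0.51]

# The assigned labels depend on the threshold you pick...
labels_at_050 = apply_threshold(probs, 0.5)
labels_at_070 = apply_threshold(probs, 0.7)

# ...but the ordering used for the Lift Chart bins depends only on the raw
# probabilities, so it is the same whatever the threshold is.
lift_order = sorted(range(len(probs)), key=lambda i: probs[i])

# Choosing the other class as positive means sorting by 1 - p instead,
# which reverses the order of the rows.
flipped_order = sorted(range(len(probs)), key=lambda i: 1 - probs[i])

print(labels_at_050, labels_at_070)
print(lift_order, flipped_order)
```

The row with probability 0.08 comes first in `lift_order`, so it lands in the first bin: the bin with the scores closest to 0.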

Jean-Robot · ‎11-24-2022

Thank you @IraWatt and @Bogdan Tsal-Tsalko ! I feel more confident about Lift Chart interpretation ! 🙂
