Lift Chart: what are the bins?

Hi community,

I'm just getting started with DataRobot, and with ML more generally.

I'm trying to understand the Lift Chart and I don't get what the bins mean. Are they evenly sized groups of randomly sampled records?

Thank you for your help!

Hey @Jean-Robot 👋,

"DataRobot creates the Lift Chart by sorting predictions in increasing order and then grouping them into equal-sized bins." Lift Chart: DataRobot docs 

So if, for instance, you were building a classification model, the lift chart for one of your classes could have 10 bins. Bin 1 would then contain the 10% of rows with the lowest predicted probability of belonging to that class.

That's my understanding of it from reading Lift Chart: DataRobot docs

All the best,

Ira

Hi, Jean!

The Lift Chart is a great tool for quickly understanding the performance of your model.

The bins are not predefined, and they are not random samples as you assumed.

Once a model is trained, it can make predictions. Take, for example, the predictions on the holdout part of the data. We sort those predictions and split them into equal-sized chunks: those are the bins. So with 5 bins, the first bin holds the 20% of rows with the smallest predictions, the second bin the next 20%, and so on up to the 5th bin with the 20% of rows with the largest predictions.

The second line (the orange one) shows the actuals: every row the model predicts on also has an actual value. Each line is built by averaging the predictions and the actuals, respectively, inside each bin.

In the end we have two lines: the predictions, which increase monotonically, and the actuals, which may not be monotonic and which we want to be as close as possible to the predictions.

Fewer bins on the chart mean less noise in the actuals; more bins give a finer picture of the distribution of predictions.
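To make the steps above concrete, here is a minimal sketch in plain Python. It is not DataRobot's actual implementation: the function name `lift_chart_bins`, the way rows are split when the count does not divide evenly, and the sample data are all assumptions made for illustration.

```python
from statistics import fmean


def lift_chart_bins(predictions, actuals, n_bins=5):
    """Return one (mean prediction, mean actual) pair per bin.

    Hypothetical helper for illustration only; not a DataRobot API.
    """
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    if not 1 <= n_bins <= len(predictions):
        raise ValueError("n_bins must be between 1 and the number of rows")

    # Step 1: sort the rows by prediction, smallest first.
    rows = sorted(zip(predictions, actuals), key=lambda row: row[0])

    # Step 2: cut the sorted rows into n_bins chunks of (nearly) equal size.
    n = len(rows)
    bins = []
    for i in range(n_bins):
        chunk = rows[i * n // n_bins:(i + 1) * n // n_bins]
        # Step 3: average predictions and actuals inside the bin.
        mean_prediction = fmean(p for p, _ in chunk)
        mean_actual = fmean(a for _, a in chunk)
        bins.append((mean_prediction, mean_actual))
    return bins


# Made-up holdout data: predicted probabilities and 0/1 actuals.
predictions = [0.9, 0.1, 0.4, 0.8, 0.2, 0.7, 0.3, 0.6, 0.5, 0.05]
actuals = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

for number, (pred, actual) in enumerate(lift_chart_bins(predictions, actuals), start=1):
    print(f"bin {number}: mean prediction {pred:.3f}, mean actual {actual:.3f}")
```

With this toy data the mean predictions rise from bin to bin by construction, while the mean actuals stay flat across the middle bins, which is the kind of gap between the two lines the chart is meant to reveal.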

Hope this helps!

Hey @IraWatt! Thank you for the explanation!

I'm not sure I'm getting it entirely. What is the prediction score used to sort the rows?

For binary classification, are the classes 0 and 1?

Hi @Bogdan Tsal-Tsalko! Thank you as well for your explanation.

What I don't understand is the "predictions" and how they are used to assign rows to bins. Are they numerical values?

A binary classification prediction is initially a continuous value between 0 and 1; applying a threshold then assigns it to class 0 or 1.
The predicted probability of the class being "1" is what is used to build the Lift Chart.
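As a rough illustration of the difference between the two, the sketch below assigns rows to bins from the raw class "1" probabilities and, separately, turns the same probabilities into 0/1 labels. The helper names, the 0.5 default threshold, the `>=` tie rule, and the numbers are assumptions for the example, not DataRobot's API or defaults.

```python
def apply_threshold(probabilities, threshold=0.5):
    """Turn continuous class "1" probabilities into 0/1 labels (assumed >= rule)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def assign_bins(probabilities, n_bins):
    """Give each row a bin number (1 = lowest probabilities) from its rank."""
    n = len(probabilities)
    # Row indices ordered by raw probability, smallest first.
    order = sorted(range(n), key=lambda i: probabilities[i])
    bin_of_row = [0] * n
    for rank, row in enumerate(order):
        bin_of_row[row] = rank * n_bins // n + 1
    return bin_of_row


# Made-up probabilities of class "1" for six rows.
probabilities = [0.92, 0.08, 0.55, 0.31, 0.47, 0.76]

labels = apply_threshold(probabilities)      # depends on the threshold
bins = assign_bins(probabilities, n_bins=3)  # depends only on the raw values

print(labels)  # [1, 0, 1, 0, 0, 1]
print(bins)    # [3, 1, 2, 1, 2, 3]
```

Moving the threshold changes `labels` but leaves `bins` untouched, because the bins depend only on the ordering of the raw probabilities.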

So the lift chart is built after the prediction scores are computed and before the threshold is applied?

And since the prediction for class 1 is used to build the Lift Chart, the first bin gathers the prediction scores farthest from 1 (closest to 0)?

Yes, you are correct!

The threshold is not used for the Lift Chart, only the raw predicted probabilities of the class being "1".
Worth mentioning:

Thank you @IraWatt and @Bogdan Tsal-Tsalko! I feel more confident about interpreting the Lift Chart! 🙂