Questions about downsampling in DataRobot

I found this article on Downsampling in combination with Upweighting: https://developers.google.com/machine-learning/data-prep/construct/sampling-splitting/imbalanced-dat...

The article's authors explain that artificial weights placed on the downsampled class help correct the bias in model scores (such as the F1 score). This matters because bias is a side effect of downsampling the training data. So now I'm wondering: does DataRobot already do upweighting automatically?

Accepted Solutions

In our smart downsampling, we "upweight" the class that has been downsampled. We then use these weights to preserve the original class ratio in both modeling and evaluating the models (i.e., it ensures statistics like F1 are not skewed or biased). For example, note that instead of "LogLoss" being used and reported as the optimization metric, it will be "Weighted LogLoss". The "Weights" function in the Advanced Options tab can be used if you chose to downsample the dataset before ingesting it into DataRobot but still want to represent the original class balance of the dataset as it was before your downsampling. This works in the same way as above.
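
To make the idea concrete, here is a rough sketch of downsampling plus upweighting in plain Python. It is not DataRobot's implementation: the function names `downsample_with_weights` and `weighted_log_loss`, the `keep_fraction` parameter, and the toy 900/100 class split are all assumptions made up for illustration. Kept majority-class rows get a weight equal to the inverse of the sampling rate, so the weighted class totals match the original data, and the log loss is averaged using those weights.

```python
import math
import random


def downsample_with_weights(labels, majority_label, keep_fraction, seed=0):
    """Downsample the majority class and return {row_index: weight}.

    Hypothetical helper: kept majority rows are upweighted so their weights
    sum to the original majority count; minority rows keep weight 1.0.
    """
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]

    # Keep a fixed number of majority rows, chosen at random.
    n_keep = max(1, round(len(majority) * keep_fraction))
    kept_majority = rng.sample(majority, n_keep)

    # Each kept majority row stands in for (original count / kept count) rows.
    majority_weight = len(majority) / n_keep
    weights = {i: majority_weight for i in kept_majority}
    weights.update({i: 1.0 for i in minority})
    return weights


def weighted_log_loss(y_true, p_pred, weights, eps=1e-15):
    """Binary log loss where each row contributes in proportion to its weight."""
    total = 0.0
    for y, p, w in zip(y_true, p_pred, weights):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / sum(weights)


# Toy data (assumed): 900 negatives, 100 positives, keep 25% of the negatives.
labels = [0] * 900 + [1] * 100
weights = downsample_with_weights(labels, majority_label=0, keep_fraction=0.25)

rows = sorted(weights)
sample_labels = [labels[i] for i in rows]
sample_weights = [weights[i] for i in rows]

# Positive rate in the downsampled rows vs. the weighted (original) rate.
raw_rate = sum(sample_labels) / len(sample_labels)
weighted_rate = (
    sum(w for y, w in zip(sample_labels, sample_weights) if y == 1)
    / sum(sample_weights)
)
print(raw_rate, weighted_rate)
```

The same weights would apply if you downsampled the data yourself before upload: a per-row weight column built this way is the kind of input the answer above suggests passing through the "Weights" option.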
