OTV

Hi, I'm wondering if someone could help me with how I'm using OTV (out-of-time validation) and partitioning my data. I want to build a scoring model for loan defaults. I don't have much data, so I'm unsure whether I should add a holdout and whether I need to change the validation length. I have tested different configurations and noticed that the choice changes the results. What would be the best approach?

Accepted Solutions
akshay
DataRobot Alumni

Working with smaller datasets can be difficult, especially if you are trying to use OTV. Do you have to use OTV for this particular use case, or could you get away with cross-validation (CV)? I think CV with more folds than the default might yield more stable results. With cross-validation, you can keep the holdout, as that is a best practice.
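To make the "more folds" idea concrete, here is a minimal sketch in plain Python of how stratified folds could be assigned so that every fold sees roughly the same default rate. It is only an illustration, not how DataRobot builds its partitions; the function name `stratified_folds`, the toy labels, and the choice of 10 folds are all assumptions, and a holdout would be set aside before this step.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds, seed=0):
    """Assign every row index to one of n_folds folds.

    Rows are dealt out class by class, so each fold ends up with
    roughly the same share of each class (hypothetical helper).
    """
    rng = random.Random(seed)

    # Group row indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into folds.
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Made-up toy target: 1 = default, 0 = repaid.
labels = [1] * 40 + [0] * 160
folds = stratified_folds(labels, n_folds=10)

for k, fold in enumerate(folds):
    defaults = sum(labels[i] for i in fold)
    print(f"fold {k}: {len(fold)} rows, {defaults} defaults")
```

With more folds, each model trains on a larger share of the data and the score is averaged over more validation sets, which is why the estimate tends to be more stable on a small dataset.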

If you have to use OTV, then given the limited data, run as many backtests as you can reasonably create for the dataset size, and don't worry about keeping the holdout.
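As a rough illustration of the trade-off between validation length and number of backtests, the sketch below carves time-ordered rows into backtests with no holdout reserved. It assumes rows are sorted oldest to newest and that each backtest trains on everything before its validation window; DataRobot's own backtest configuration may differ (for example, in how training windows are sized). The function name `backtest_splits` and the row counts are hypothetical.

```python
def backtest_splits(n_rows, validation_length, n_backtests):
    """Return (train_range, validation_range) pairs for time-ordered rows.

    Backtest 0 validates on the most recent rows; each later backtest
    steps back by one validation window. Training uses every row that
    comes before the validation window (hypothetical helper).
    """
    splits = []
    for b in range(n_backtests):
        valid_end = n_rows - b * validation_length
        valid_start = valid_end - validation_length
        if valid_start <= 0:
            raise ValueError("not enough rows for this many backtests")
        splits.append((range(0, valid_start), range(valid_start, valid_end)))
    return splits


# Made-up sizes: 600 loans sorted by date, 100 rows per validation window.
splits = backtest_splits(n_rows=600, validation_length=100, n_backtests=4)

for b, (train, valid) in enumerate(splits):
    print(f"backtest {b}: train rows {train.start}-{train.stop - 1}, "
          f"validate rows {valid.start}-{valid.stop - 1}")
```

A shorter validation length leaves room for more backtests, but each one is scored on fewer rows, so the two settings have to be balanced against the dataset size.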

 
