New values in categorical variables during prediction

How does DataRobot handle new values in a categorical variable during prediction?

Accepted Solutions

It depends on the blueprint, but in general a new level is assigned to the “low count” category derived from the training data. In other words, a category that was absent from the training data is treated like a rare one. If such categories later become common, model performance will likely start to deteriorate, and that is the signal to retrain. For that case, you may also want to use DataRobot Continuous AI for automated model retraining.
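
To make the "low count" idea concrete, here is a minimal sketch in plain Python. It is not DataRobot's implementation (that depends on the blueprint); the bucket name `LOW_COUNT`, the `min_count` threshold, and both helper functions are hypothetical and only illustrate the general technique of folding rare and unseen levels into one bucket.

```python
from collections import Counter

# Hypothetical name for the shared bucket; not a DataRobot identifier.
LOW_COUNT = "__low_count__"


def fit_levels(train_values, min_count=2):
    """Return the levels seen at least `min_count` times in training."""
    counts = Counter(train_values)
    return {level for level, n in counts.items() if n >= min_count}


def encode(values, known_levels):
    """Map rare or never-seen levels to the shared low-count bucket."""
    return [v if v in known_levels else LOW_COUNT for v in values]


train = ["red", "red", "blue", "blue", "blue", "teal"]
levels = fit_levels(train)

# "teal" was rare in training and "magenta" was never seen;
# both end up in the same bucket at prediction time.
encoded = encode(["red", "teal", "magenta"], levels)
print(encoded)
```

Under these assumptions the model never sees an unknown level at scoring time, which is also why a formerly unseen category that becomes common tends to drag performance down: it keeps being scored as if it were rare.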

shaz13
DataRobot Alumni

Adding to that: you can also set Humility rules on the deployment in case your scope changes or you want to handle uncertain predictions.

Read more here - Humility Tab
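
As a rough illustration of what such a rule does conceptually, the sketch below flags a scoring row whose categorical value was not seen in training and returns a fallback instead of the model's prediction. This is plain Python, not the DataRobot API or the Humility configuration format; `apply_unseen_rule`, `known_levels`, and `fallback` are hypothetical names.

```python
def apply_unseen_rule(row, known_levels, prediction, fallback=None):
    """Override the prediction when a row contains an unseen category.

    row          -- dict of feature name -> value for one scoring row
    known_levels -- dict of feature name -> set of levels seen in training
    prediction   -- the model's prediction for this row
    fallback     -- value returned instead when the rule triggers
    """
    unseen = {
        col: val
        for col, val in row.items()
        if col in known_levels and val not in known_levels[col]
    }
    if unseen:
        return {"prediction": fallback, "flagged": True, "unseen": unseen}
    return {"prediction": prediction, "flagged": False, "unseen": {}}


known = {"color": {"red", "blue"}}
ok = apply_unseen_rule({"color": "red"}, known, prediction=0.82)
flagged = apply_unseen_rule({"color": "magenta"}, known, prediction=0.40)
print(ok)
print(flagged)
```

Counting how often rows get flagged over time gives a simple signal for when new categories have become common enough to justify retraining.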

Replies

Dear @halonest,

My colleagues @rangga.ugahari and @shaz13 have provided solutions to your query for different stages of the machine learning lifecycle; I hope they are satisfactory answers for you.

Your question specifically asks how DataRobot handles this at prediction time, which implies the treatment of the deployed model in production. @shaz13's suggestion is elaborated on further in DataRobot University's instructor-led MLOps I course:

https://university.datarobot.com/mlops-i

 

 
