How does DataRobot handle new values in a categorical variable during prediction?
Dear @halonest,
as my colleagues @rangga.ugahari and @shaz13 have already provided solutions to your query at different stages of the machine learning lifecycle, I hope their answers are satisfactory for you.
Your question specifically asks how DR handles this at prediction time, which concerns how the deployed model is treated in production. @shaz13's suggestion is elaborated on further in DataRobot University's instructor-led MLOps I course.
Adding to that: you can also set Humility rules during deployment in case your scope changes or you want to handle uncertain predictions.
Read more here - Humility Tab
It depends on the blueprint, but in general new levels are assigned to the “low count” category that was built from the training data. So, a category that is not present in the training data is treated much like a rare (“low count”) one. If those categories eventually become common in the scoring data, we will probably start seeing the model's performance deteriorate, and then it’s time to retrain. For this case, you may also want to leverage DataRobot Continuous AI for automated model retraining.
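To make the “low count” idea concrete, here is a minimal sketch in plain Python. It is only an illustration of the general technique, not DataRobot's actual implementation: the function names, the `min_count` threshold, and the `__low_count__` label are all hypothetical, and the real behavior depends on the blueprint.

```python
from collections import Counter

# Hypothetical label for the shared bucket; not a DataRobot identifier.
LOW_COUNT = "__low_count__"


def fit_levels(train_values, min_count=2):
    """Return the set of levels seen at least `min_count` times in training."""
    counts = Counter(train_values)
    return {level for level, n in counts.items() if n >= min_count}


def encode(values, known_levels):
    """Map rare or previously unseen levels to the shared low-count bucket."""
    return [v if v in known_levels else LOW_COUNT for v in values]


# "green" is rare in training; "purple" never appears in training at all.
train = ["red", "red", "blue", "blue", "blue", "green"]
known = fit_levels(train, min_count=2)

# At prediction time, both the rare and the unseen level land in the same bucket.
scored = encode(["red", "green", "purple"], known)
print(scored)
```

In this sketch the model would treat "purple" exactly like "green". That is harmless while such values are rare, but it is also why performance can drift once a new level becomes frequent.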