We are working on a time series problem.
Basically, one month after deployment, we want to retrain the model on the latest month of data. The idea is to freeze the parameters and retrain the existing model with the latest data. So we would not build the model from scratch, but rather take the existing model and train it on the latest data.
Is it possible to do this in DataRobot? Thanks in advance for the support.
Thanks and Regards
Sorry for the delay.
Thanks for your reply. We are following the process you suggested. Let me summarize it:
Build a separate project that contains the additional data (along with the previous data). Ensure the training parameters, random seed, and backtest period are the same as in the previous project. Then select the model that has the same blueprint and is trained on 100% of the data.
So the previous model will be trained on data up to the previous month, and the new model will be trained on data up to this month.
Note that additional training data cannot be brought into an existing project. It sounds like you are training with something like 12 months of data in project A, then get another month's worth of data and want to add it to project A and retrain the same model, if I'm understanding your ask correctly.
What needs to be done instead is to create a new project B with a new dataset. That could contain your full 13 months of data, or perhaps just your last rolling 12 months of data. You could run a full machine learning competition of models and view them on the leaderboard, or you can look at the model that won the competition in project A and, inside project B, pull the same model blueprint from the repository and tell DataRobot to run only that single approach.
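In case it helps with preparing the dataset for project B, here is a rough sketch in plain Python (no DataRobot calls) of the two options above: keeping the full history versus a rolling 12-month window. The row layout, the `date` and `sales` column names, the sample dates, and the helper names `add_months` and `select_training_rows` are all assumptions made up for illustration; adapt them to your own data before uploading the result as the new project's dataset.

```python
from datetime import date


def add_months(d, months):
    """Return the first day of the month that is `months` after d's month.

    `months` may be negative to step backwards.
    """
    idx = d.year * 12 + (d.month - 1) + months
    return date(idx // 12, idx % 12 + 1, 1)


def select_training_rows(rows, latest_month, window_months=None):
    """Pick the rows to put in the dataset for the new project.

    rows          -- list of dicts with a 'date' key (hypothetical layout)
    latest_month  -- any date inside the most recent month to include
    window_months -- None keeps all history; N keeps only the last N months
    """
    end = add_months(latest_month, 1)  # exclusive upper bound
    start = None
    if window_months is not None:
        start = add_months(latest_month, -(window_months - 1))
    return [
        r for r in rows
        if r["date"] < end and (start is None or r["date"] >= start)
    ]


# Made-up example: 13 monthly rows, January 2023 through January 2024.
rows = [
    {"date": add_months(date(2023, 1, 1), i), "sales": 100 + i}
    for i in range(13)
]

# Option 1: full 13 months of data.
full = select_training_rows(rows, date(2024, 1, 1))

# Option 2: last rolling 12 months (drops the oldest month).
rolling = select_training_rows(rows, date(2024, 1, 1), window_months=12)

print(len(full), len(rolling))
```

Whichever option you pick, keeping the same window rule from month to month makes it easier to compare the retrained models over time.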
To deploy the model from project B, you'd have the option of creating a new deployment entry, or replacing the model in your existing deployment with the model you created in B.