Model Retraining after Deployment

Abhishek Saha · ‎04-28-2021

Hello,

We are working on a time series problem.

Basically, one month after deployment, we want to retrain the model on the latest month of data. The idea is to freeze the parameters and retrain the existing model with the latest data. So we would not build the model from scratch, but rather take the existing model and train it on the latest data.

Is it possible to do this in DataRobot? Thanks in advance for the support.

Thanks and Regards

Abhishek Saha

Linda · ‎04-28-2021

Hi @Abhishek Saha - Welcome to the DataRobot Community! Thanks for posting your question here. I can give you a quick pointer to this similar question, which was answered recently (by @doyouevendata).

Hoping other DR experts can help with your question!

linda

doyouevendata · ‎04-28-2021

Abhishek,

Note that additional training data cannot be brought into an existing project. It sounds like you are training with something like 12 months of data in project A, then get another month's worth of data and want to add it to project A and retrain the same model, if I'm understanding your ask correctly.

What needs to be done instead is to create a new project B with a new dataset. That could contain your full 13 months of data, or perhaps just your last rolling 12 months of data. You could run a full machine learning competition of models and view them on the leaderboard, or you can look at the model that won the competition in project A and, inside project B, pull the same model blueprint from the repository and tell DataRobot to run only that single approach.

For deployment of the model from project B, you'd have the option of creating a new deployment entry, or replacing the model at your existing deployment with the model you created in B.
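
As a rough illustration of the dataset choice for project B described above (full 13 months versus the last rolling 12 months), here is a minimal sketch in plain Python. It does not call DataRobot at all; the function names `add_months` and `training_window`, and the sample monthly rows, are made up for the example. It only prepares the rows you would then upload as the new project's dataset.

```python
from datetime import date
from typing import List, Optional, Tuple

Row = Tuple[date, float]  # (observation date, target value); hypothetical layout


def add_months(d: date, months: int) -> date:
    """Shift a date to the first day of the month `months` away (may be negative)."""
    idx = d.year * 12 + (d.month - 1) + months
    return date(idx // 12, idx % 12 + 1, 1)


def training_window(rows: List[Row], latest_month: date,
                    months: Optional[int] = None) -> List[Row]:
    """Select the rows to upload for the new project.

    latest_month: first day of the most recent month to include.
    months: None keeps all history up to latest_month (e.g. 13 months);
            an integer keeps only that many trailing months (rolling window).
    """
    end = add_months(latest_month, 1)  # exclusive upper bound
    start = None if months is None else add_months(end, -months)
    return [(d, v) for d, v in rows
            if d < end and (start is None or d >= start)]


# Sample data: 13 monthly observations, 2020-04 through 2021-04.
rows = [(add_months(date(2020, 4, 1), i), float(i)) for i in range(13)]

full_history = training_window(rows, date(2021, 4, 1))            # all 13 months
rolling_12 = training_window(rows, date(2021, 4, 1), months=12)   # last 12 months
```

Which window is better depends on the data: the full history gives the model more examples, while the rolling window drops the oldest month so the training set keeps a constant length.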

Abhishek Saha · ‎05-07-2021

Sorry for the delay.

Thanks for your reply. We are following the process you suggested. Let me summarize it.

Build a separate project that contains the additional data (along with the previous data). Ensure the training parameters, random seed, and backtest period are the same as in the previous project. Then select the model that has the same blueprint and was trained on 100% of the data.

So the previous model will have been trained on data up to the previous month, and the new model will have been trained on data up to this month.

Regards
Abhishek
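
The parity check in the summary above (same training parameters, random seed, and backtest period across the two projects) can be sketched as a comparison of two settings dictionaries. This is only an illustration: the keys and values below are hypothetical placeholders that you would fill in by hand from each project, not fields from the DataRobot API.

```python
def settings_diff(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every setting that differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Hypothetical settings noted down from each project.
project_a = {
    "blueprint": "hypothetical-blueprint-id",
    "random_seed": 42,
    "number_of_backtests": 3,
    "training_end": "2021-03-31",
}
project_b = {
    "blueprint": "hypothetical-blueprint-id",
    "random_seed": 42,
    "number_of_backtests": 3,
    "training_end": "2021-04-30",
}

diff = settings_diff(project_a, project_b)
# Only the training end date should differ between the two projects.
print(diff)
```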

Linda · ‎05-07-2021

Hi @Abhishek - can you confirm that the response from @doyouevendata was the solution you needed? Or are you looking for more ideas? Thanks

Doctor Youness · ‎05-18-2022

I have a question about this:

What does this message mean: "80% Frozen parameter settings were applied to subsequent sample sizes to increase processing speed for larger datasets"?

Awaiting your reply

Best regards
