I am building a multi-series time-series forecast model.
The data has 200 user-product combinations (series IDs). Some products have sales history from 2019 and some only from 2021.
How do I choose my forecast window and feature derivation window?
The business expects a forecast for the next 18 months.
Can I set the feature derivation window to 3 months and the forecast window to 17 months?
Are there any guidelines or rules of thumb to follow when setting these two values?
Q: How do I choose my forecast window and feature derivation window?
A: The forecast window depends on your use case requirements. The feature derivation window is something you can experiment with: try building multiple projects with different allowed feature derivation windows and see which one performs best.
Q: Can I set the feature derivation window to 3 months and the forecast window to 17 months?
A: Yes, as long as DataRobot allows it. DataRobot might reject certain feature derivation and forecast window combinations if many series in the dataset do not have enough samples to support them.
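Before creating a project, a quick sanity check on history length per series can flag the ones likely to cause trouble. The sketch below is only a rough heuristic: it assumes a series needs at least FDW + FW months of history to be usable, which is a simplification and not DataRobot's actual validation rule. The series IDs, dates, and function names are made up for illustration.

```python
from datetime import date

def months_between(start: date, end: date) -> int:
    """Calendar months from start to end, counting both the first and last month."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

def short_series(first_sale, last_date, fdw_months, fw_months):
    """Return the series IDs whose history is shorter than FDW + FW months.

    Assumption: FDW + FW months is the minimum usable history. The real
    requirement in any given tool may differ.
    """
    needed = fdw_months + fw_months
    return sorted(
        sid for sid, start in first_sale.items()
        if months_between(start, last_date) < needed
    )

# Hypothetical first-sale dates per series and a hypothetical last observed month.
first_sale = {
    "user1-prodA": date(2019, 1, 1),
    "user2-prodB": date(2021, 6, 1),
}
last_date = date(2022, 6, 1)

flagged = short_series(first_sale, last_date, fdw_months=3, fw_months=17)
print(flagged)
```

With these made-up dates, the series starting in mid-2021 has 13 months of history and falls below the 20 months the heuristic asks for, so it is the one flagged.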
Q: Are there any guidelines or rules of thumb to follow when setting these two values?
A: Experiment with various windows and see what works best. There are also some general tips for improving model performance on long forecast windows, such as building separate projects to forecast months 1-6, 7-12, and 13-18, or splitting your projects into series that start in 2019 and series that start in 2021.
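The two splitting tips can be planned out with a few lines of plain Python before building anything. This is a minimal sketch with hypothetical helper names and made-up series IDs; it only computes the horizon ranges and the series groups, and does not call any DataRobot API.

```python
from collections import defaultdict

def horizon_buckets(max_horizon, bucket_size):
    """Split forecast distances 1..max_horizon into consecutive (start, end) ranges."""
    return [
        (start, min(start + bucket_size - 1, max_horizon))
        for start in range(1, max_horizon + 1, bucket_size)
    ]

def cohorts_by_start_year(first_sale_year):
    """Group series IDs by the year their sales history begins."""
    groups = defaultdict(list)
    for series_id, year in first_sale_year.items():
        groups[year].append(series_id)
    return {year: sorted(ids) for year, ids in sorted(groups.items())}

# One candidate project per horizon range for an 18-month forecast.
buckets = horizon_buckets(18, 6)
print(buckets)  # [(1, 6), (7, 12), (13, 18)]

# Hypothetical series IDs mapped to the year their history starts.
cohorts = cohorts_by_start_year(
    {"user1-prodA": 2019, "user2-prodB": 2021, "user3-prodC": 2019}
)
print(cohorts)
```

Each (horizon range, cohort) pair is then a candidate project, so three ranges and two cohorts would mean up to six projects to build and compare.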
Hope this helps.
You can, but as mentioned before, it's definitely worth trying different feature derivation windows to see which one works best.
Also, forecasts that far in advance (~17 months) can get fairly speculative. DataRobot will often use a lower maximum threshold, depending on the data, and you may need to use some of the predicted values to forecast this far ahead (or beyond).
Hi @Lokeshwaran, as my colleagues Abdul and Felix have recommended, the Feature Derivation Window (FDW) and Forecast Window (FW) are highly dependent on the use case, product behavior, etc. and one should really test different values for best results.
What you can do may not necessarily be what you should do, and as Felix suggested, having a large FW in relation to the FDW can get rather speculative.
Would you be concerned if the periodicity is not captured in the FDW you've set but is within your FW?
With 200 or more user-product series IDs, would it be better to segment your products?
Would it be better to create a hierarchical structure of your products and perhaps develop a hierarchical model for better results?
These considerations and topics are covered in the DataRobot University instructor-led Time Series Modelling class ( https://university.datarobot.com/datarobot-time-series-modeling ).
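On the periodicity question above, one rough way to check whether a seasonal cycle fits inside a given FDW is to look for the first peak in the series' autocorrelation and compare that lag with the window. The sketch below uses synthetic monthly data with a 12-month cycle and hypothetical function names; it is a simple heuristic under those assumptions, not the method DataRobot uses internally.

```python
import math

def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at a positive integer lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    num = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return num / denom

def first_seasonal_lag(values, max_lag):
    """Return the first lag >= 2 that is a positive local peak of the autocorrelation.

    Returns None if no such peak exists up to max_lag.
    """
    # acf[i] holds the autocorrelation at lag i + 1.
    acf = [autocorrelation(values, k) for k in range(1, max_lag + 2)]
    for i in range(1, len(acf) - 1):
        if acf[i] > 0 and acf[i] > acf[i - 1] and acf[i] > acf[i + 1]:
            return i + 1
    return None

# Synthetic monthly sales with a 12-month cycle (made-up data, 4 years).
sales = [100 + 20 * math.sin(2 * math.pi * t / 12) for t in range(48)]

fdw_months = 3
period = first_seasonal_lag(sales, max_lag=24)
cycle_outside_fdw = period is not None and period > fdw_months
print(period, cycle_outside_fdw)
```

If the detected cycle is longer than the FDW, as it is here with a 3-month window, lag features derived from that window cannot see a full season, which is one reason to test a longer FDW on real data.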