I have an XGBoost-based model for time series analysis. The training dataset consists of a datetime column and a daily production value.
When I want to predict the next few days, is it possible to feed the model the production values observed in the previous days in order to get a more accurate forecast?
The rule of thumb is: any feature you think carries signal or is informative for the prediction should be included among the model's features, as long as you have access to it at inference time.
In a time series problem we usually leave a gap before the prediction point. This gap is called the "Blind History Gap", and it is used to make sure that the model is not simply a function of the most recent value but learns from the historical signals in the time series.
I would suggest trying two projects, with (-30, -7) and (-30, -3) as your Feature Derivation Windows, to see whether there is any significant difference in your chosen metric. Choose the window that is practical, accounts for the delay in data availability, and gives stable scores across different partition sets.
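To make the two windows concrete, here is a minimal sketch in plain Python. It assumes that a window (-30, -7) means "use the values from 30 days before to 7 days before the prediction point, both ends included"; the helper `window_values` and the production values are invented for illustration and are not part of any product API.

```python
from datetime import date, timedelta

def window_values(history, forecast_date, start_offset, end_offset):
    """Collect the values observed between forecast_date + start_offset and
    forecast_date + end_offset days (offsets are negative, ends inclusive)."""
    values = []
    for offset in range(start_offset, end_offset + 1):
        day = forecast_date + timedelta(days=offset)
        if day in history:
            values.append(history[day])
    return values

# Invented daily production values: 1.0, 2.0, ... for 40 consecutive days.
start = date(2022, 5, 1)
history = {start + timedelta(days=i): float(i + 1) for i in range(40)}

forecast_date = start + timedelta(days=40)  # first day without an actual

wide_gap = window_values(history, forecast_date, -30, -7)    # 7-day blind gap
narrow_gap = window_values(history, forecast_date, -30, -3)  # 3-day blind gap

# The narrower gap lets features see four more recent days.
print(len(wide_gap), max(wide_gap))
print(len(narrow_gap), max(narrow_gap))
```

Any feature derived from these values (mean, min, max, last value, and so on) only sees data that would already be available at prediction time, which is the point of the gap.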
Hope this helps.
OK, but what I need to know is: is it possible to also give the deployed model the latest true values as input, in order to get a finer forecast?
Yes. All you need to do is create a new feature, called lag_target, which lags the target by 1 day in this case.
Then, in production, you would pass a dataframe with the latest true value (the previous day's actual) in the lag_target column.
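As a rough sketch of the training side, the lag feature could be built like this in plain Python. The helper `add_lag_target`, the column name `production`, and the values are assumptions made up for illustration; with a pandas dataframe the same idea is usually expressed with `shift(1)` on the target column.

```python
from datetime import date, timedelta

def add_lag_target(rows, target="production", lag_days=1):
    """Return copies of rows with a lag_target column holding the target
    value observed lag_days earlier (None when that day is missing)."""
    by_date = {row["date"]: row[target] for row in rows}
    lagged = []
    for row in rows:
        previous_day = row["date"] - timedelta(days=lag_days)
        lagged.append({**row, "lag_target": by_date.get(previous_day)})
    return lagged

# Invented daily production values, for illustration only.
history = [
    {"date": date(2022, 6, 1), "production": 10.0},
    {"date": date(2022, 6, 2), "production": 12.0},
    {"date": date(2022, 6, 3), "production": 11.0},
]

# The first day has no previous actual, so it is dropped before training.
training_rows = [r for r in add_lag_target(history) if r["lag_target"] is not None]
for r in training_rows:
    print(r["date"], r["production"], r["lag_target"])
```

Note that with a 1-day lag the previous day's actual is only known for the first day you forecast. For days further ahead you would need a longer lag, or to feed earlier predictions back in as the lag value.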
Let me know if you need any further help. Thanks!
Sure, here's an example.
We have Date, Sales as our target, and targetlag (the target lagged by 1 day).
And for 2022-06-27, you would send the prediction input as below:
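For illustration only, the prediction input for 2022-06-27 could be assembled along these lines. The Sales figures are invented, the helper `prediction_input` is hypothetical, and the lag column is named lag_target as in the earlier reply.

```python
from datetime import date, timedelta

# Hypothetical recent actuals; the Sales figures are invented.
actuals = {
    date(2022, 6, 24): 120.0,
    date(2022, 6, 25): 135.0,
    date(2022, 6, 26): 128.0,
}

def prediction_input(forecast_date, actuals, lag_days=1):
    """Build one prediction row: the date to forecast plus the actual
    observed lag_days before it. The target itself is not included."""
    lag_date = forecast_date - timedelta(days=lag_days)
    if lag_date not in actuals:
        raise KeyError(f"no actual available for {lag_date}")
    return {"Date": forecast_date.isoformat(), "lag_target": actuals[lag_date]}

row = prediction_input(date(2022, 6, 27), actuals)
print(row)
```

The row carries the date to predict and the actual Sales from 2022-06-26 in lag_target; the Sales value for 2022-06-27 itself is left out because that is what the model predicts.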
Hope this helps.