I want to forecast demand for a product. My data is semi-regular: on days when demand is zero no entry is made, so that day is missing from the data. Looking at the Time Steps docs, I think my data meets the requirements for being semi-regular: "Data that is mostly regularly spaced".
Will keeping the data semi-regular impact the accuracy of the models in DataRobot, or should the missing days be imputed?
I would try it both ways: with the missing days left out and with zeros imputed. Whether your data is detected as regular or irregular depends on how many dates are missing and on the pattern (or lack of pattern) in the missing dates. If only a relative handful of dates are missing, it will probably be detected as regular and accuracy shouldn't be affected. If the missing dates follow a very regular pattern, e.g. weekends are always missing, it will probably be detected as semi-regular and again accuracy shouldn't be affected. If it is detected as irregular, you would have to model in row-based mode, and this can impact accuracy. If that happens, I would impute the zeros for this kind of problem.
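If you want to try the imputed version outside DataRobot, here is a minimal sketch in plain Python. It is only an illustration, not DataRobot's own method: the function names `missing_days` and `fill_missing_days` and the sample rows are made up, and it assumes one row per day with a date and a demand value. It lists the missing dates (so you can see whether they follow a pattern such as weekends) and then fills them with zeros.

```python
from datetime import date, timedelta


def _calendar_days(start, end):
    """Yield every calendar day from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def missing_days(rows):
    """Return the dates between the first and last observation that have no row."""
    observed = {day for day, _ in rows}
    return [d for d in _calendar_days(min(observed), max(observed)) if d not in observed]


def fill_missing_days(rows, fill_value=0):
    """Return one (date, demand) pair per calendar day, using fill_value for gaps.

    Only gaps between the first and last observed dates are filled; zero-demand
    days before the first row or after the last row are not recovered.
    """
    observed = dict(rows)
    return [
        (d, observed.get(d, fill_value))
        for d in _calendar_days(min(observed), max(observed))
    ]


# Hypothetical sample: days with zero demand were never recorded.
rows = [
    (date(2023, 1, 2), 5),
    (date(2023, 1, 3), 3),
    (date(2023, 1, 5), 7),
    (date(2023, 1, 6), 2),
    (date(2023, 1, 9), 4),
]

gaps = missing_days(rows)
filled = fill_missing_days(rows)

for day in gaps:
    # Printing the weekday name helps spot patterns such as "always weekends".
    print(day.isoformat(), day.strftime("%A"))
```

You could then write `filled` back out as your training file and compare that project against one built on the original data.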
You should get a warning similar to the one in the screenshot. BTW, the screenshot shows a link, "Fix Gaps For Duration Based", to a new TS data prep tool that has just rolled out to work with datasets like yours.
After starting the project, irregular datasets will have the feature derivation window and forecast window expressed as a number of rows in the UI.