Time-series Time-steps: Regular vs Semi-regular


I want to forecast demand for a product. My data is semi-regular: no entry is made on days when demand is zero, so those days are missing from the data. Looking at the Time Steps docs, I think my data meets the requirement for being semi-regular: "Data that is mostly regularly spaced".

Will keeping the data semi-regular impact the accuracy of the models in DataRobot, or should the missing days be imputed?


4 Replies

I would try it both ways: with the missing days left out and with zeros imputed. Whether your data is detected as regular, semi-regular, or irregular depends on how many dates are missing and on whether the gaps follow a pattern. If only a relative handful of dates are missing, it will probably be detected as regular and accuracy shouldn't be affected. If the gaps are very regular, e.g. weekends are always missing, it will probably be detected as semi-regular and again accuracy shouldn't be affected. If it is detected as irregular, you would have to model in row-based mode, which can hurt accuracy. If that happens, I would impute the zeros for this kind of problem.
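If you want to check the gap pattern and build the zero-imputed version outside DataRobot before uploading, something along these lines might work. This is only a rough pandas sketch, not anything DataRobot-specific: the column names `date` and `demand`, the helper names, and the sample rows are all assumptions for illustration, and it assumes a single series with at most one row per day.

```python
import pandas as pd


def summarize_gaps(df, date_col="date"):
    """Return the calendar days that have no row in a daily series."""
    dates = pd.to_datetime(df[date_col]).dt.normalize()
    full = pd.date_range(dates.min(), dates.max(), freq="D")
    return full.difference(pd.DatetimeIndex(dates))


def impute_zero_demand(df, date_col="date", target_col="demand"):
    """Add a row for every missing day and set its demand to zero.

    Assumes one row per date; any other columns are left as NaN
    on the added rows and would need their own fill logic.
    """
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col]).dt.normalize()
    out = out.set_index(date_col).sort_index()
    full = pd.date_range(out.index.min(), out.index.max(), freq="D")
    out = out.reindex(full)
    out[target_col] = out[target_col].fillna(0)
    out.index.name = date_col
    return out.reset_index()


# Hypothetical sample: days with zero demand were never recorded.
raw = pd.DataFrame(
    {
        "date": ["2021-11-01", "2021-11-02", "2021-11-04", "2021-11-07"],
        "demand": [5, 3, 7, 2],
    }
)

missing = summarize_gaps(raw)
# Weekday counts show whether the gaps follow a pattern (e.g. only weekends).
gap_weekdays = missing.day_name().value_counts()

filled = impute_zero_demand(raw)
print(f"{len(missing)} missing days")
print(gap_weekdays)
print(filled)
```

If the weekday counts are concentrated on the same days every week, the gaps are patterned; if they are scattered, the zero-filled file is probably the safer one to upload.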

Thanks. Do you know whether the UI indicates if the data has been picked up as irregular or regular?

You should get a warning that looks similar to this. BTW, this screenshot shows a link, "Fix Gaps For Duration Based", to a new TS data prep tool that has just rolled out to work with datasets like yours.

Screenshot from 2021-11-12 07-55-46.png

After starting the project, irregular datasets will have the feature derivation window and forecast window expressed as a number of rows in the UI.

Screenshot from 2021-11-12 08-01-46.png

Awesome, thanks @James Clemens
