Time step is irregular in Time Series


Hello

I got the message "Time step is irregular".

My dataset is 'Bike Sharing Demand' from Kaggle.

(https://www.kaggle.com/competitions/bike-sharing-demand/overview)

 

My target is 'count' and the date column is 'datetime'.

What does this alert mean?

 

Screenshot 2022-06-22 at 1.19.13 AM.png

4 Replies

Hi @cookie_yamyam ,

Thanks for your question. This notification appears because there are gaps in the time series data.

 

From the competition description, I can see that the training data covers only the first 19 days of each month, and the test period runs from the 20th to the end of the month. You can go ahead and use row-based time series here, since the problem statement is to predict part of each month.
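As a rough illustration of what the alert is reacting to, the sketch below finds the most common spacing between consecutive timestamps and reports every pair that deviates from it. This is plain Python, not DataRobot's own check; the function name `find_gaps` and the toy timestamps are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_gaps(timestamps):
    """Return (modal_step, gaps).

    gaps lists the (previous, current) timestamp pairs whose spacing
    differs from the most common spacing in the data.
    """
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two timestamps to measure a step")
    pairs = list(zip(ordered, ordered[1:]))
    deltas = [current - previous for previous, current in pairs]
    modal_step = Counter(deltas).most_common(1)[0][0]
    gaps = [(p, c) for p, c in pairs if c - p != modal_step]
    return modal_step, gaps


# Toy hourly data (hypothetical): two rows on Jan 19, then a jump to Feb 1.
timestamps = [
    datetime(2011, 1, 19, 22),
    datetime(2011, 1, 19, 23),
    datetime(2011, 2, 1, 0),
    datetime(2011, 2, 1, 1),
]
step, gaps = find_gaps(timestamps)
```

On this toy data, `step` is one hour and `gaps` holds the single jump from Jan 19 23:00 to Feb 1 00:00, the same kind of hole that the 20th-to-month-end split leaves in the training data.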

 

In other cases, where data points are missing and you want to generate them, you can use the built-in data prep tool for time series, which helps you fill in those missing data points. Refer to this doc link: TS Data Prep. Let us know if this answers your question, and if so, please mark this as the accepted solution.
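To make "generating missing data points" concrete, here is a minimal sketch of the idea outside DataRobot: lay the observations on a regular grid and mark the absent steps explicitly. It is a stand-in for illustration, not the data prep tool's actual logic; the function name `regularize`, the one-hour step, and the toy values are all assumptions.

```python
from datetime import datetime, timedelta


def regularize(observations, step):
    """Place observations on a regular time grid.

    observations maps timestamp -> value. Steps with no observation
    are emitted with value None so the gap is explicit.
    """
    if not observations:
        return []
    current, end = min(observations), max(observations)
    grid = []
    while current <= end:
        grid.append((current, observations.get(current)))
        current += step
    return grid


# Toy hourly counts (made-up values); the 02:00 row is missing.
observations = {
    datetime(2011, 1, 1, 0): 16,
    datetime(2011, 1, 1, 1): 40,
    datetime(2011, 1, 1, 3): 13,
}
filled = regularize(observations, timedelta(hours=1))
```

`filled` now has four rows, with `None` at 02:00. How to treat that placeholder (leave it missing, impute it, or aggregate to a coarser step) is a modelling choice this sketch does not make.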

dalilaB
DataRobot Alumni

To ensure that your dataset is a time series with regular steps, upload your dataset to the AI Catalog, then go to the hamburger icon and choose Prepare data for Time Series.

Screen Shot 2022-06-24 at 9.34.47 AM.png

Then fill in this screen and click Run.
Screen Shot 2022-06-24 at 9.36.44 AM.png

 

Here is an already filled-in screen that may serve as a guide:

Screen Shot 2022-06-24 at 9.48.14 AM.png
After clicking Run, you will notice that a new name is created for the new, cleaned dataset and that Spark SQL code is generated.
Screen Shot 2022-06-24 at 9.49.37 AM.png

The Spark SQL can be used in an AI Catalog Workspace pipeline.

 


Thank you for the answer!

In this case, how can I solve the problem?

If I add a categorical column identifying the month (like 1, 2, 3, ...) to use as the 'Series ID' in DataRobot, is it possible to run a multiseries model?

 

(TS Data Prep link is broken.)


Hi @cookie_yamyam,

Yes, you can try it as a multiseries project with the month as the series identifier to see whether it improves model performance. Note that this dataset has 2 years of data, so you need to include the year in the series ID as well, for example Jan2011, Feb2011, ..., Dec2012.
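A series ID of that shape can be derived from the 'datetime' column before uploading. The sketch below assumes the column holds strings like "2011-01-01 00:00:00"; that format, the helper name `month_series_id`, and the sample rows are assumptions for illustration.

```python
from datetime import datetime

# Fixed abbreviations so the result does not depend on the system locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def month_series_id(timestamp):
    """Build a month-plus-year series ID such as 'Jan2011'."""
    return f"{MONTHS[timestamp.month - 1]}{timestamp.year}"


# Hypothetical sample of raw 'datetime' values.
raw_values = [
    "2011-01-01 00:00:00",
    "2011-02-15 13:00:00",
    "2012-12-19 23:00:00",
]
parsed = [datetime.strptime(v, "%Y-%m-%d %H:%M:%S") for v in raw_values]
series_ids = [month_series_id(ts) for ts in parsed]
```

Including the year keeps January 2011 and January 2012 as separate series, so two years of data give 24 distinct IDs rather than 12.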

Thanks for flagging the data prep link. I am attaching the updated link here.
