
Calendars in Time Series Forecasting

Blue LED

When setting up a time series model you have the option to include a calendar from which additional modelling features will be created:

Screenshot 2021-01-04 at 12.37.19.png

 

From the screenshot below, it can be seen that the sales on Christmas Day for both 2012 and 2013 are zero, and that there is no difference in behaviour around Black Friday.

Screenshot 2021-01-04 at 12.42.24.png

 

However, when a model is trained on this data with a calendar file that contains dates for both Black Friday and Christmas Day, the forecast for November-December 2014 shows a dip in sales on Black Friday:

Screenshot 2021-01-04 at 12.46.35.png

 

This is unexpected and suggests that DataRobot is learning a pattern around Christmas and incorrectly applying it to other calendar events. In the prediction explanations, the driving feature for the dip in sales on Black Friday is "Date (days to next calendar event) (actual)", which further supports this interpretation.

Would it make sense for "Date (days to next calendar event)" to be computed for each type of event so that patterns with one event are not applied to another?
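
To make the suggestion concrete, here is a minimal sketch of what a per-event version of the feature could look like. This is not how DataRobot derives its features; the function name and the (date, event name) calendar layout are assumptions made purely for illustration.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import date, timedelta


def days_to_next_event_by_type(dates, calendar):
    """Return {event_name: [days until the next occurrence, or None]}.

    `calendar` is an iterable of (date, event_name) pairs (a hypothetical
    layout, not DataRobot's calendar file format).
    """
    # Group the event dates by event name so each event gets its own feature.
    by_name = defaultdict(list)
    for event_date, name in calendar:
        by_name[name].append(event_date)

    features = {}
    for name, event_dates in by_name.items():
        event_dates.sort()
        column = []
        for d in dates:
            # Index of the first occurrence of this event on or after d.
            i = bisect_left(event_dates, d)
            if i < len(event_dates):
                column.append((event_dates[i] - d).days)
            else:
                column.append(None)  # no later occurrence in the calendar
        features[name] = column
    return features


calendar = [
    (date(2013, 11, 29), "Black Friday"),
    (date(2013, 12, 25), "Christmas"),
]
dates = [date(2013, 11, 27) + timedelta(days=k) for k in range(4)]
features = days_to_next_event_by_type(dates, calendar)
```

With one column per event name, a pattern learned from the countdown to Christmas would stay in the Christmas column and could not leak into the days before Black Friday.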

4 Replies
Data Scientist

Hi @T1 ,

A few quick questions upfront:
1. When you included the calendar, does it include events throughout your training data and into the (relative) future from the end of your training data?
2. How did you label the holidays in the calendar file? Did you label them as 'Black Friday' or 'Christmas' or just 'Holiday' in the Calendar file? Are they labeled the same for the respective holiday throughout the training and prediction periods?

If the holidays are named uniquely, then DataRobot will treat 'Christmas'-derived features as separate from 'Black Friday'-derived features. There will still be some holiday-name-independent features created ('days since holiday...', 'days to holiday...'), but I wouldn't expect any confusion between the different holidays if they are given distinct names.

Blue LED

Hi @jarred ,

Thank you for getting back to me. Please see my response:

1. When you included the calendar, does it include events throughout your training data and into the (relative) future from the end of your training data?

Yes, the calendar includes events for both the training data and the prediction period.


2. How did you label the holidays in the calendar file? Did you label them as 'Black Friday' or 'Christmas' or just 'Holiday' in the Calendar file? Are they labeled the same for the respective holiday throughout the training and prediction periods?

Black Friday and Christmas were both labeled separately.

If the holidays are named uniquely, then DataRobot will treat 'Christmas'-derived features as separate from 'Black Friday'-derived features. There will still be some holiday-name-independent features created ('days since holiday...', 'days to holiday...'), but I wouldn't expect any confusion between the different holidays if they are given distinct names.

I have now compared the prediction explanations for Black Friday and Christmas, and the holiday-name-independent features ('days from previous calendar event', 'days to next calendar event') are present in the top 3 explanations for both holiday events. So it would appear that the holiday-independent features are the cause.

Data Scientist

@T1 Thanks for following up. That's an interesting situation. I'd be happy to dig a bit deeper if you're willing to send me the dataset, but I certainly understand if you are unable to share it. If so, please email me at jarred.bultema@datarobot.com

As an immediate next step, I'd suggest you remove any holidays that you know are irrelevant (such as 'Black Friday') from your calendar file and re-run the project with the new calendar file. While we generally don't observe 'low-information' holidays detracting from model performance, they appear to do so in this case. I would expect that removing the holiday entry will remove the Black Friday dip, but please let me know if it does not.
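
If it helps, here is a rough sketch of stripping an event out of a calendar CSV before re-uploading it. The column names `date` and `event`, the sample rows, and the helper name are assumptions for illustration only; adjust them to match your actual calendar file.

```python
import csv
import io


def drop_events(calendar_csv_text, names_to_drop, name_column="event"):
    """Return the calendar CSV text without rows whose event name is in names_to_drop.

    `name_column` is a hypothetical header; use whatever your file calls it.
    """
    reader = csv.DictReader(io.StringIO(calendar_csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Keep every row except the events we want to exclude.
        if row[name_column] not in names_to_drop:
            writer.writerow(row)
    return out.getvalue()


# Illustrative calendar contents, not the poster's actual file.
original = (
    "date,event\n"
    "2012-11-23,Black Friday\n"
    "2012-12-25,Christmas\n"
    "2013-11-29,Black Friday\n"
    "2013-12-25,Christmas\n"
)
filtered = drop_events(original, {"Black Friday"})
```

The filtered text can then be saved as a new file and used as the calendar when re-running the project.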

Blue LED

Thank you for the suggestion. I am confident that removing Black Friday from the calendar would work.

As the project is not sensitive, I have shared it with you on the platform. I will also email you the CSV used for predictions.

 
