Demand Forecast—Multi-Series

(Updated February 2021)

This article provides an end-to-end walkthrough of how to create a demand forecast with DataRobot Automated Time Series. Specifically, you’ll learn about importing data, selecting a target, configuring modeling options, and evaluating, interpreting, and deploying a model. (If you want to see how you can use the API to do demand forecasting with multiseries data, see this notebook in the DataRobot Community GitHub.)

We are going to use this dataset from a company with ten stores to forecast demand for the next 30 days. In the dataset, the stores are stacked on top of each other in a long format. As Figure 1 shows, the data contains variables of several types: date, numeric, categorical, and text. Three variables are worth highlighting:

  • the Date column with days as the unit of analysis,
  • the Sales column, which is the target variable we want to forecast, and
  • the Store column, which contains the names of the different stores we will be forecasting.

Figure 1. Dataset in long format with stores “stacked” on top of each other, with a mixture of data types including date, text, categorical, and numeric
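The long format described above can be sketched in a few lines of Python. This is an illustration only: the store names and sales values are made up, and only the Date, Store, and Sales column names come from the dataset described in this article.

```python
from datetime import date, timedelta

# Hypothetical per-store daily sales: one list of values per store.
wide = {
    "Store A": [120.0, 135.0, 128.0],
    "Store B": [80.0, 95.0, 90.0],
}
start = date(2020, 1, 1)


def to_long(wide_sales, start_date):
    """Stack each store's series on top of the next: one row per (Date, Store)."""
    rows = []
    for store, values in wide_sales.items():
        for offset, sales in enumerate(values):
            rows.append({
                "Date": (start_date + timedelta(days=offset)).isoformat(),
                "Store": store,
                "Sales": sales,
            })
    return rows


long_rows = to_long(wide, start)
for row in long_rows:
    print(row)
```

Each store contributes its own block of rows, and the Store column is what later lets DataRobot tell the series apart.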

Uploading dataset and setting options

To create a demand forecast model using DataRobot, upload the dataset into DataRobot (new project page) and specify Sales as the target column. Then tell DataRobot that this is a time series problem by setting up time aware modeling, selecting the date field, and selecting time series modeling. DataRobot detects that this is a multiseries dataset and returns a list of potential variables to use for the series ID. In this case we will select Store, and click Set series ID (Figure 2).

Figure 2. Set up Time Aware Modeling, and set series ID

We need to tell DataRobot how far into the future we want to forecast, and how far into the past to go to create lag features and rolling statistics. We will set the forecast distance to 1 to 30 days and, for now, keep the default feature derivation window (Figure 3).

Figure 3. Set forecast distance
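To make these two settings concrete, here is a minimal sketch of lag features and a rolling statistic computed from the history up to the forecast point, paired with forecast distances of 1 to 30 days. The function name, the chosen lags, and the 7-day window are assumptions for illustration; DataRobot derives its own, much larger, set of features.

```python
from statistics import mean


def derive_features(history, lags=(1, 7), window=7):
    """Build lag and rolling-mean features from values observed up to the forecast point.

    `history` is ordered oldest to newest; its last element is the forecast point.
    Lags are counted back from the first day being forecast, so lag 1 is the
    last observed value.
    """
    features = {}
    for lag in lags:
        features[f"sales_lag_{lag}"] = history[-lag]
    features[f"sales_mean_{window}"] = mean(history[-window:])
    return features


# Made-up daily sales for one store.
history = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0]
features = derive_features(history)

# One row per forecast distance, 1 to 30 days ahead of the forecast point.
forecast_distances = list(range(1, 31))
rows = [{"forecast_distance": d, **features} for d in forecast_distances]
print(rows[0])
```

The feature derivation window limits how far back `history` may reach, and the forecast distance determines how many future days each forecast point has to cover.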

Automated Time Series has a number of modeling options that can be configured. Here, we will look at the most commonly used options.

Backtests

Next is the option to partition the data (Show Advanced Settings > Date/Time tab). With time series, you can’t randomly sample rows into partitions, because the model would then be trained on data from after the period it is validated on. The correct approach is backtesting, which trains on historical data and validates on more recent data. You can adjust the validation periods, as well as the number of backtests, to suit your needs. We will use the defaults for this dataset, as shown in Figure 5.

Figure 5. Backtesting and validation length options
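The idea behind backtesting can be sketched as follows. This is a simplified illustration, not DataRobot's partitioning logic: the date range, the three folds, and the 30-day validation length are assumptions, and the sketch ignores holdout and gap settings.

```python
from datetime import date, timedelta


def make_backtests(first_day, last_day, n_backtests, validation_days):
    """Return folds (newest first); each trains only on days before its validation window."""
    folds = []
    validation_end = last_day
    for _ in range(n_backtests):
        validation_start = validation_end - timedelta(days=validation_days - 1)
        if validation_start <= first_day:
            raise ValueError("not enough history for this many backtests")
        folds.append({
            "train": (first_day, validation_start - timedelta(days=1)),
            "validate": (validation_start, validation_end),
        })
        # The next (older) fold validates on the period just before this one.
        validation_end = validation_start - timedelta(days=1)
    return folds


folds = make_backtests(date(2020, 1, 1), date(2020, 12, 31),
                       n_backtests=3, validation_days=30)
for fold in folds:
    print(fold["train"], "->", fold["validate"])
```

Every fold validates on a period that comes strictly after its training data, which is what a random split cannot guarantee.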

Known In Advance Variables

For time series projects, we can indicate features whose future values we will know in advance, so DataRobot can also generate non-lagged features for these variables. In Advanced Options, under the Time Series tab, you can specify columns that will be known at the forecast point (Figure 4).

Figure 4. Declaring known in advance variables
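The difference between lagged and non-lagged features can be illustrated with a small sketch. The Promo column and the helper below are hypothetical; they only show that a known in advance feature can be read at the day being forecast, while other features are limited to values observed at the forecast point.

```python
def build_row(series, forecast_point, distance, known_in_advance):
    """Assemble features for predicting day `forecast_point + distance`.

    `series` maps a feature name to a list of daily values indexed by day number.
    """
    target_day = forecast_point + distance
    row = {}
    for name, values in series.items():
        # Every feature can contribute its value observed at the forecast point.
        row[f"{name}_lag"] = values[forecast_point]
        if name in known_in_advance:
            # Known in advance features can also be read at the target day itself.
            row[f"{name}_actual"] = values[target_day]
    return row


series = {
    "Sales": [100.0, 110.0, 105.0, None, None],  # future sales are unknown
    "Promo": [0, 0, 0, 1, 0],                    # hypothetical planned promotion flag
}
row = build_row(series, forecast_point=2, distance=1, known_in_advance={"Promo"})
print(row)
```

Here the model would see that a promotion is planned for the target day even though no promotion was running at the forecast point.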

Event Calendar

You can also provide an event calendar, which DataRobot uses to generate forward-looking features so that the model can better capture special events. The event calendar for this dataset (Figure 6) consists of two fields: the date and the name of the event.

Figure 6. Event Calendar
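A two-field calendar like the one described above could be prepared as a CSV file. The sketch below is an assumption-laden example: the events and the column headers are made up, and the `days_until_next_event` helper only illustrates what a forward-looking feature might look like, not how DataRobot computes its own.

```python
import csv
import io
from datetime import date

# Hypothetical events; a real calendar would list the events relevant to your stores.
events = [
    (date(2020, 11, 27), "Black Friday"),
    (date(2020, 12, 25), "Christmas Day"),
]


def write_calendar(rows):
    """Serialize (date, event name) pairs as a two-column CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Date", "Event"])  # header names are an assumption
    for day, name in rows:
        writer.writerow([day.isoformat(), name])
    return buffer.getvalue()


def days_until_next_event(day, rows):
    """Days from `day` to the next event on or after it, or None if there is none."""
    upcoming = [(event_day - day).days for event_day, _ in rows if event_day >= day]
    return min(upcoming) if upcoming else None


calendar_csv = write_calendar(events)
print(calendar_csv)
print(days_until_next_event(date(2020, 11, 20), events))
```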

To add the event calendar, scroll down a bit in the Time Series tab. Find the Calendar of holidays and special events section and add your calendar here (Figure 7). For non-multiseries projects: if you don’t have a calendar handy, you can have DataRobot generate one specific to a selected country code. The resulting calendar will include all relevant events during the time period of your dataset.

Figure 7. Adding the event calendar

There are many more options we could experiment with, but for now this is enough to get started.

Modeling

When we hit Start, DataRobot takes the original features we gave it and creates hundreds of derived features for the numeric, categorical, and text variables. It then reduces the derived features to a smaller set, as shown in Figure 8.

Figure 8. DataRobot has created many derived features from the original features

After Autopilot (Full or Quick) completes, we can examine the results on the Leaderboard and evaluate the top-performing model across all backtests.

You’ll also see that we ran all backtests and unlocked the holdout data (Figure 9). To do this, select your model on the Leaderboard and click Run in the All Backtests column. Then, from the worker panel, click Unlock project Holdout for all models.

Figure 9. Running all backtests and unlocking holdout

Accuracy Over Time

In Figure 10 we can see the actual and predicted values plotted over time. We can also change the backtest and forecast distances, so we can evaluate the accuracy at different forecast distances across the validation periods.

Figure 10. Accuracy over time

Figure 11 shows the option to see the accuracy over time for each series, or to see the average across all series.

Figure 11. Accuracy over time with drop down by series, or average

Forecast vs Actuals

On the Forecast vs Actuals tab (as shown in Figure 12), we can see what the forecast would be for any given forecast point in the validation period. This allows you to compare how predictions behave from different forecast points to different times in the future.

Figure 12. Forecast vs Actuals

Series Insights

The Series Insights tab provides the accuracy of each series based on the metric we choose, in this case RMSE (Figure 13). This is a good way to quickly evaluate the accuracy of each individual series. 

Figure 13. Series Insights
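For readers who want to reproduce a per-series score outside the UI, the following sketch computes RMSE for each series from exported actuals and predictions. The record layout and the numbers are assumptions for illustration.

```python
from collections import defaultdict
from math import sqrt


def rmse_by_series(records):
    """Return {series_id: RMSE} from (series_id, actual, predicted) records."""
    squared_errors = defaultdict(list)
    for series_id, actual, predicted in records:
        squared_errors[series_id].append((actual - predicted) ** 2)
    return {sid: sqrt(sum(errs) / len(errs)) for sid, errs in squared_errors.items()}


# Made-up actual and predicted sales for two stores.
records = [
    ("Store A", 100.0, 103.0),
    ("Store A", 120.0, 116.0),
    ("Store B", 80.0, 80.0),
    ("Store B", 90.0, 96.0),
]
scores = rmse_by_series(records)
for store, score in sorted(scores.items()):
    print(store, round(score, 2))
```

Because RMSE is expressed in the units of the target, a store with higher sales volume will often show a higher RMSE even when its relative error is similar.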

Stability

The Stability tab provides a summary of how well a model performs on different backtests to determine if it is consistent across time (Figure 14).

Figure 14. Stability

The Forecasting Accuracy tab explains how accurate the model is for each forecast distance (Figure 15).

Figure 15. Forecasting Accuracy

In the Feature Impact tab (under the Understand division, Figure 16) you can see the relative impact of each feature on your specific model, including the derived features.

Figure 16. Feature Impact

The Feature Effects tab shows how changes to the value of each feature change the model’s predictions. In Figure 17 you can see that as Sales (nonzero) (35 day average baseline) increases, Sales (actual) also increases proportionally.

Figure 17. Feature Effects
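One common way to produce this kind of curve is partial dependence: force a feature to a series of values in every row and average the model’s predictions at each value. The sketch below assumes that approach and uses a made-up stand-in model and feature names; it is not presented as DataRobot’s implementation.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value in every row."""
    curve = []
    for value in grid:
        modified = [{**row, feature: value} for row in rows]
        predictions = [predict(row) for row in modified]
        curve.append((value, sum(predictions) / len(predictions)))
    return curve


def toy_model(row):
    """Stand-in model (an assumption): sales scale with the baseline plus a promo bump."""
    return 0.9 * row["baseline"] + 20.0 * row["promo"]


rows = [
    {"baseline": 100.0, "promo": 0},
    {"baseline": 150.0, "promo": 1},
]
curve = partial_dependence(toy_model, rows, "baseline", [50.0, 100.0, 200.0])
for value, average_prediction in curve:
    print(value, average_prediction)
```

With this stand-in model the averaged prediction rises steadily as the baseline rises, which mirrors the proportional relationship visible in Figure 17.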

Prediction Explanations

Prediction Explanations explain why your model assigned a value to a specific observation (Figure 18).

Figure 18. Prediction Explanations

Predictions

Now that we have built and selected our demand forecast model, we want to get predictions. There are two ways to get time series predictions from DataRobot. 

The first is the simplest: you can use the UI to drag-and-drop a prediction dataset (Figure 19). This is typically used for testing, or for small ad-hoc forecasting projects that don’t require frequent predictions.

Figure 19. Predictions, drag-and-drop

The second method is to deploy a REST endpoint and request predictions via API (Figure 20). This connects the model to a dedicated prediction server and creates a dedicated deployment object.

Figure 20. Deploy model to prediction server
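As a rough sketch of the second method, the code below prepares, but does not send, an HTTP request to a deployment. Everything specific here is an assumption: the host, deployment ID, and token are placeholders, and the URL path and header names should be checked against the integration snippet shown for your own deployment, which may require additional headers.

```python
import json
import urllib.request


def build_prediction_request(host, deployment_id, api_token, rows):
    """Prepare (but do not send) a POST request carrying the rows to score as JSON."""
    # Assumed URL layout; confirm it against your deployment's integration snippet.
    url = f"{host}/predApi/v1.0/deployments/{deployment_id}/predictions"
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json; charset=UTF-8",
            "Authorization": f"Bearer {api_token}",  # assumed authentication scheme
        },
    )


# Placeholder values; replace them with the details of a real deployment.
request = build_prediction_request(
    host="https://example.invalid",
    deployment_id="DEPLOYMENT_ID",
    api_token="API_TOKEN",
    rows=[{"Date": "2021-01-01", "Store": "Store A"}],
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live prediction server.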

More Information

Have a look at this DataRobot Community GitHub notebook, which walks through demand forecasting with multiseries data.

If you’re a licensed DataRobot customer, search the in-app Platform Documentation for Time series modeling or Multiseries modeling.

Last update: 02-10-2021 02:29 PM