I am running a multi-series model in which each series has approximately 30 rows of data. The accuracy scores suggested that I should run the two series in separate projects. After splitting the data and setting up the separate projects, I found that each dataset falls short of the minimum requirement of 35 rows.
I'm just wondering why I can run a multi-series project even though each series has fewer than 35 rows. I'm guessing DataRobot looks at the combined dataset, i.e. 60 rows, to derive "average features" that apply to both series, but some clarity on the process would help me better understand the problem I'm having.
Hi @lhaviland, I have seen this article previously. Although it is really useful and helped me through the first models I ran, I don't think anything in it sheds light on the issue I'm currently having.
Hi @Shai,
Time series data size limitations come from the following requirements:
1. We need at least 20 data points to train models.
2. We need at least 4 data points to validate models.
3. We need some history for feature derivation, so we require an additional 11 data points for that.
These three requirements add up to the 35-row minimum (20 + 4 + 11). When you combine two series into a multi-series project with 60 rows, the training and validation data are pooled from both series, which gives you >=20 training rows and >=4 validation rows. DataRobot Time Series in multi-series mode doesn't train each series separately; most of our models take advantage of multi-series learning. So having fewer than 20 training rows or fewer than 4 validation rows for a single series is not a problem in a multi-series project, but it becomes one when you try to model the series separately.
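To make the arithmetic concrete, here is a minimal Python sketch of the row check described above. It is not DataRobot code: the constant names and the `meets_minimum` helper are hypothetical, and it assumes that each series gives up 11 rows to feature derivation while the remaining rows are pooled across series for training and validation. The actual partitioning logic may differ.

```python
# Row requirements quoted in the reply above.
MIN_TRAINING_ROWS = 20
MIN_VALIDATION_ROWS = 4
HISTORY_ROWS = 11  # rows needed as history for feature derivation


def meets_minimum(rows_per_series):
    """Return True if the pooled rows satisfy the quoted minimums.

    Assumption (not confirmed in this thread): every series loses
    HISTORY_ROWS rows to feature derivation, and what remains is
    pooled across series for training and validation.
    """
    usable = sum(max(n - HISTORY_ROWS, 0) for n in rows_per_series)
    return usable >= MIN_TRAINING_ROWS + MIN_VALIDATION_ROWS


# Minimum for a single-series project: 20 + 4 + 11
single_series_minimum = MIN_TRAINING_ROWS + MIN_VALIDATION_ROWS + HISTORY_ROWS

print(single_series_minimum)      # 35
print(meets_minimum([30]))        # one 30-row series on its own: False
print(meets_minimum([30, 30]))    # two 30-row series pooled: True
```

Under these assumptions a single 30-row series leaves only 19 usable rows, short of the 24 needed for training plus validation, while two such series pooled leave 38.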