Thanks to everyone who joined us for our first DataRobot Live! We reviewed the model capabilities of our platform and walked through a demo on building production-ready models. A transcript of the Q&A follows.
Q1: You mentioned Pathfinder [has] regular use cases. Where can I find this?
A1: Pathfinder can be found at pathfinder.datarobot.com. This includes many use case ideas from common industries, some of which are fleshed out to include business considerations and implementations, and even notebook solutions!
Q2: I am new to DataRobot and I am working with a Time Series dataset that only uses weekdays. Is there a way to include weekends as well?
A2: DataRobot has a Time Series DataPrep tool that allows you to aggregate to the weekly level easily if you think weekly predictions are granular enough. If you want to stick to daily, you can include those weekends with zero value outcomes, and DataRobot will automatically generate indicators for each day of the week. It will quickly learn that there are zero-value outcomes on weekends.
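If you prepare the data yourself before uploading, the "include those weekends with zero value outcomes" option might look like the sketch below. It uses pandas, not a DataRobot API, and the column names `date` and `sales` are hypothetical. It assumes a single series and fills only Saturdays and Sundays, so a missing weekday (a holiday, for instance) stays blank rather than being treated as a zero.

```python
import pandas as pd


def add_zero_weekends(df: pd.DataFrame, date_col: str = "date",
                      target_col: str = "sales") -> pd.DataFrame:
    """Expand a weekday-only daily series to every calendar day.

    Weekend rows are added with a target of 0; other missing days stay NaN.
    Column names are illustrative assumptions, not DataRobot requirements.
    """
    series = df.copy()
    series[date_col] = pd.to_datetime(series[date_col])
    series = series.set_index(date_col).sort_index()

    # Build the full calendar between the first and last observed date.
    full_range = pd.date_range(series.index.min(), series.index.max(), freq="D")
    filled = series.reindex(full_range)

    # dayofweek: Monday=0 ... Saturday=5, Sunday=6
    weekend_gap = (filled.index.dayofweek >= 5) & filled[target_col].isna().to_numpy()
    filled.loc[weekend_gap, target_col] = 0

    return filled.rename_axis(date_col).reset_index()


weekdays = pd.DataFrame({
    "date": ["2024-01-04", "2024-01-05", "2024-01-08"],  # Thu, Fri, Mon
    "sales": [120.0, 95.0, 130.0],
})
daily = add_zero_weekends(weekdays)
print(daily)
```

For multiple series, the same function could be applied per series (for example with a `groupby`) before recombining the results.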
Q3: Is there a common way to analyze just weekdays? I know that there are ways to build SQL code to use the Time Series stuff with gaps.
A3: As with the previous question, you can use the Time Series DataPrep tool to further clean your time series data, or use the approach recommended above. Learn more here.
Q4: I have seen many different examples/iterations of Time Series. My current example is to predict One Day Ahead. Any examples that can be shared on just that alone?
A4: As shared live, when you build time series models with DataRobot's AutoTS, you can choose the "forecast window." This could be just the next day, the next week, or between 4 and 7 days away from the prediction time. There is a lot of flexibility here.
Q5: What are the steps to collect the information based on the past - I think I saw a video of including future dates, but input them as blank, and the program automatically estimates the missing dates, one day ahead.
A5: Time Series (and feature discovery) do automatic feature engineering in which they derive rolling metrics over the past day, week, month, or any custom time frame. To make future predictions, you do need to provide a dataset with blank outcomes in future dates. If you have multiple series, then you need blanks on future dates for each series.
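To make the "blank outcomes in future dates" step concrete, here is a minimal pandas sketch that appends one blank row per future day for each series. It is an illustration of the dataset shape only, not DataRobot code, and the column names `date`, `store`, and `sales` are hypothetical. It assumes daily data and does not populate any other features you might know in advance.

```python
import pandas as pd


def append_future_rows(history: pd.DataFrame, horizon_days: int,
                       date_col: str = "date", series_col: str = "store",
                       target_col: str = "sales") -> pd.DataFrame:
    """Append `horizon_days` future rows per series with a blank (NaN) target."""
    history = history.copy()
    history[date_col] = pd.to_datetime(history[date_col])

    future_parts = []
    for series_id, group in history.groupby(series_col):
        # Start the day after this series' last observed date.
        start = group[date_col].max() + pd.Timedelta(days=1)
        future_dates = pd.date_range(start, periods=horizon_days, freq="D")
        future_parts.append(pd.DataFrame({
            date_col: future_dates,
            series_col: series_id,
            target_col: float("nan"),  # blank outcome to be predicted
        }))

    return pd.concat([history, *future_parts], ignore_index=True)


history = pd.DataFrame({
    "date": ["2024-01-04", "2024-01-05", "2024-01-04", "2024-01-05"],
    "store": ["A", "A", "B", "B"],
    "sales": [120.0, 95.0, 80.0, 70.0],
})
# One day ahead: each store gets a single blank row for 2024-01-06.
scoring = append_future_rows(history, horizon_days=1)
print(scoring)
```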
Q6: Does the model retrain itself automatically when it is fed with new data entries?
A6: It won't retrain itself by default, but you can set up automatic retraining jobs in MLOps. You can retrain based on triggers like a decline in model performance, or you can retrain on a schedule. Learn more here.
Q7: What dataset size can be used with DataRobot? (million data points? and max number of features?)
A7: Data size limits are mostly based on file size. Depending on your license and/or install (if on premises), you can model on 5 GB or 10 GB of data. There is also a feature limit of 20,000 features, but the size limit overrides this one. When dealing with big data, we recommend downsampling the data.
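As one way to downsample before uploading, the pandas sketch below takes a uniform random sample sized to a byte budget. The function name and the budget are hypothetical, and it measures in-memory size, which only approximates the size of the file you would upload. Uniform random sampling suits ordinary tabular data; for time series you would more likely trim history or aggregate so that the date sequence stays intact.

```python
import pandas as pd


def downsample_to_budget(df: pd.DataFrame, max_bytes: int, seed: int = 0) -> pd.DataFrame:
    """Randomly sample rows so the frame's in-memory size is roughly under max_bytes."""
    current_bytes = int(df.memory_usage(deep=True).sum())
    if current_bytes <= max_bytes:
        return df  # already within budget

    # Keep approximately the fraction of rows that fits the budget.
    keep_fraction = max_bytes / current_bytes
    return df.sample(frac=keep_fraction, random_state=seed)


big = pd.DataFrame({"feature": range(1000), "target": [i % 2 for i in range(1000)]})
budget = int(big.memory_usage(deep=True).sum()) // 2  # illustrative budget: half the size
small = downsample_to_budget(big, max_bytes=budget)
print(len(big), len(small))
```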
If you have any feedback on answers or other questions, please feel free to comment below. You can join us for an upcoming session by registering here.