While DataRobot's Autopilot modeling process is sufficient in most cases, you may want to build on it with custom tuning methods. In this AI Accelerator, you will familiarize yourself with DataRobot's fine-tuning API calls, using them to control DataRobot's pattern-search approach and to implement a modified brute-force grid search over the text and categorical preprocessing pipeline and the hyperparameters of an XGBoost model. This notebook serves as an introductory example that other approaches can build on. Bayesian optimization, for example, uses a probabilistic model to judiciously sift through the hyperparameter space and converge on an optimal solution; it is presented next in this accelerator bundle.
Note that as a best practice, you should wait until a model is in a near-finished state before searching for the best hyperparameters. Specifically, the following should already be finalized:
- Training data (e.g., data sources)
- Model validation method (e.g., group cross-validation, random cross-validation, or backtesting); how the problem is framed influences all subsequent steps, because it changes what error is minimized
- Feature engineering (particularly calculations driven by subject matter expertise)
- Preprocessing and data transformations (e.g., word or character tokenizers, PCA, embeddings, normalization)
- Algorithm type (e.g., GLM, tree-based, neural net)
These decisions typically have a larger impact on model performance than adjusting a machine learning algorithm's hyperparameters (especially in DataRobot, where the automatically chosen hyperparameters are already quite competitive).
About this Accelerator
This AI Accelerator teaches you how to access, understand, and tune blueprints, covering both preprocessing and model hyperparameters. You'll work programmatically with DataRobot's advanced tuning, which you can then adapt to your other projects.
What you will learn
You'll learn how to:
* Prepare for tuning a model via the DataRobot API
  - Load a project and model for tuning
  - Set the validation type for minimizing error
  - Extract model metadata
  - Get model performance
  - Review hyperparameters
* Run a single advanced tuning session
* Implement your own custom grid search, for a single model and across multiple models
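The steps above can be sketched with the DataRobot Python client. This is a minimal illustration, assuming the `datarobot` package is installed and a client is already configured; the project/model IDs and the XGBoost parameter names and values below are placeholders, not values from this accelerator.

```python
import itertools

def tune_model(project_id, model_id, parameter_name, value):
    """Run one advanced tuning session that changes a single hyperparameter.

    project_id / model_id are placeholder identifiers for an existing
    DataRobot project and model.
    """
    import datarobot as dr  # assumes dr.Client(...) has been configured
    model = dr.Model.get(project=project_id, model_id=model_id)
    session = model.start_advanced_tuning_session()
    session.set_parameter(parameter_name=parameter_name, value=value)
    job = session.run()  # queues a tuning ModelJob in DataRobot
    return job.get_result_when_complete()  # the retrained, tuned model

# A brute-force grid search simply enumerates every combination of the
# candidate values, then runs one tuning session per combination.
grid = {
    "learning_rate": [0.01, 0.05, 0.1],      # hypothetical candidate values
    "colsample_bytree": [0.6, 0.8, 1.0],
}
combos = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
print(len(combos))  # 9 candidate configurations
```

In the full gridsearch sections of this notebook, each combination would be passed through a tuning session and the resulting models compared on the project's validation metric.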