One of the questions we hear most often is, “What are the key components of the DataRobot architecture?” In this article, I’ll describe each component and how they all work together to create our scalable enterprise AI platform.
The Application Server houses all of the main administrative components. It handles authentication, project management, and user administration, and provides an endpoint for our API. It also manages the queue of modeling requests made by various projects, which are executed by Modeling Workers running on modeling nodes. A single modeling node can host multiple modeling workers (i.e., portions of processing power for performing tasks) and a single DataRobot system can have multiple modeling nodes.
When you purchase DataRobot, you get access to a set number of concurrent modeling workers. In general, one modeling worker can train one model or generate one additional insight (for example, Feature Impact, Prediction Explanations, etc.). Modeling workers operate in parallel, allowing users to simultaneously create multiple models and generate additional insights. This helps minimize the time required to train all the models associated with a particular project.
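The worker-queue pattern described above can be sketched in a few lines of Python. This is a minimal illustration of how a fixed pool of concurrent workers drains a queue of modeling tasks in parallel; the task names and worker count are hypothetical, and this is not DataRobot's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical queue of modeling requests: each entry is a model to train
# or an insight to compute. Names are illustrative only.
tasks = [
    "train:GradientBoosting",
    "train:ElasticNet",
    "insight:FeatureImpact",
    "insight:PredictionExplanations",
]

MODELING_WORKERS = 2  # the concurrent worker count included with a license

def run_task(task):
    # A real modeling worker would train a model or compute an insight here.
    kind, name = task.split(":")
    return f"{name} ({kind}) complete"

# The queue is drained in parallel, bounded by the available worker count,
# so total wall-clock time shrinks as more workers are added.
with ThreadPoolExecutor(max_workers=MODELING_WORKERS) as pool:
    results = list(pool.map(run_task, tasks))

for r in results:
    print(r)
```

With two workers, two tasks run at a time; doubling the worker count would roughly halve the time to clear the queue, which is why more concurrent workers shortens total project training time.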
DataRobot is architected so that the amount of data a customer can use to train a model is limited only by the physical hardware resources allocated to the DataRobot environment. Modeling workers require varying amounts of machine resources to complete their tasks. Base modeling workers are assigned a fixed CPU and memory limit (typically 4 CPU cores and 30GB of RAM), which allows them to build models with training datasets up to 1.5GB in size (uncompressed size on disk) and up to 10 classes for multiclass classification problems.
Flexible modeling workers, in contrast, can scale from 4 CPU cores up to 20 CPU cores and use much larger memory allocations, enabling them to process significantly larger training datasets and up to 100 classes for multiclass classification problems. DataRobot dynamically adds computing power and memory as data size or problem complexity increases. The capacity a flexible modeling worker can scale to is limited only by the resources available on its modeling node.
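The base-versus-flexible sizing rules above can be expressed as a simple routing function. The thresholds come straight from this article, but the function itself is a hypothetical helper for illustration, not DataRobot's actual scheduler.

```python
# Thresholds taken from the article; the routing logic is an illustrative
# sketch, not DataRobot's real worker-assignment code.
BASE_MAX_GB = 1.5       # uncompressed dataset size a base worker can handle
BASE_MAX_CLASSES = 10   # multiclass limit for a base worker
FLEX_MAX_CLASSES = 100  # multiclass limit for a flexible worker

def choose_worker(dataset_gb: float, n_classes: int = 2) -> str:
    """Pick a worker tier for a training job (hypothetical helper)."""
    if dataset_gb <= BASE_MAX_GB and n_classes <= BASE_MAX_CLASSES:
        return "base"      # fixed allocation: ~4 CPU cores / 30GB RAM
    if n_classes <= FLEX_MAX_CLASSES:
        return "flexible"  # scales from 4 up to 20 CPU cores, more memory
    raise ValueError("problem exceeds flexible modeling worker limits")

print(choose_worker(0.8))       # small binary problem fits a base worker
print(choose_worker(5.0, 40))   # larger data / more classes needs flexible
```

The point of the two tiers is that small jobs run on cheap, fixed allocations, while larger jobs scale up only as far as the modeling node's free resources allow.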
Modeling workers are also stateless, so they can be configured to join and leave the environment on demand, which can reduce hardware costs when DataRobot is deployed in a VPC. Within a Hadoop cluster, these workers run as YARN containers.
Trained models are written back into the Data Layer, and their accuracy is reflected on the model Leaderboard through the Application Server.
Predictions can be generated in batch by uploading data through the web UI, or on demand via an API endpoint. In both cases, a modeling worker is temporarily used to generate the predictions. For low-latency, high-throughput prediction environments, dedicated Prediction Cores are recommended. Prediction Cores are reserved exclusively for making predictions, which avoids contention with modeling activities: prediction resources are available whenever a user wants to generate predictions, and modeling users never have to wait while modeling resources are consumed by prediction requests. Furthermore, key statistics about those predictions and the data submitted with them are returned to the Application Server and displayed to users for monitoring the health of the models, including analyses of data drift and accuracy. Prediction Cores can also be deployed in an environment disconnected from the DataRobot platform, allowing enterprises to deploy models in segregated networks.
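To make the data drift monitoring mentioned above concrete, here is a sketch of one widely used drift statistic, the Population Stability Index (PSI), computed over a categorical feature. This simple version is illustrative of the general idea only; it is not DataRobot's exact drift methodology, and the feature values are made up.

```python
import math
from collections import Counter

def psi(expected, actual):
    """Population Stability Index between a training sample ("expected")
    and a scoring sample ("actual") of categorical values.

    PSI near 0 means the distributions match; larger values signal drift.
    Illustrative sketch, not DataRobot's actual drift computation."""
    categories = set(expected) | set(actual)
    e_counts, a_counts = Counter(expected), Counter(actual)
    score = 0.0
    for c in categories:
        # Smooth empty bins so the log term stays finite.
        e = max(e_counts[c] / len(expected), 1e-6)
        a = max(a_counts[c] / len(actual), 1e-6)
        score += (a - e) * math.log(a / e)
    return score

# Hypothetical feature values seen at training time vs. prediction time.
train = ["A"] * 50 + ["B"] * 50
scoring_same = ["A"] * 25 + ["B"] * 25     # same 50/50 mix: no drift
scoring_shifted = ["A"] * 45 + ["B"] * 5   # 90/10 mix: distribution shifted

print(round(psi(train, scoring_same), 4))
print(round(psi(train, scoring_shifted), 4))
```

A monitoring layer would compute a statistic like this per feature on incoming prediction data and flag features whose drift score exceeds a threshold, which is the kind of health signal surfaced back through the Application Server.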
I hope this served as a good introduction to the various components of a DataRobot environment. If you have any questions, please drop us a note below.