Gut Check the Recommended Model


When you hit Start (as part of the previous step, Evaluate Data and Select Target), DataRobot kicked off a high-speed battle to find the algorithms that best suit your particular target and dataset. You may know that most data scientists specialize in a particular programming language, for example, R, Python, or SAS. Imagine hundreds of data scientists in a room together, each with their own particular expertise, duking it out to find the best algorithm for the situation. That’s exactly what is happening here, except that DataRobot does it in a matter of minutes instead of days or months.

Before we even look at a model, we can learn something. Go to the Data tab and look at the Importance column; it shows how closely each column correlates with the target. We can see that the Vendor column, which contains the name of the vendor, is the most important feature with respect to whether the delivery will be late.
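DataRobot's exact Importance calculation is its own, but you can get a feel for the idea with a common stand-in such as mutual information between each column and the target (the column names and data below are invented for illustration):

```python
# Rough analogue of an importance screen: score each column's association
# with a binary target. Here, mutual information stands in for DataRobot's
# proprietary metric (an assumption, not the product's actual formula).
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 1000
vendor = rng.integers(0, 5, n)       # pretend encoded "Vendor" column
noise = rng.integers(0, 5, n)        # an unrelated column
late = (vendor >= 3).astype(int)     # target depends only on vendor

X = np.column_stack([vendor, noise])
scores = mutual_info_classif(X, late, discrete_features=True, random_state=0)
ranking = sorted(zip(["Vendor", "Noise"], scores), key=lambda t: -t[1])
print(ranking[0][0])  # the vendor column ranks first
```

A column that largely determines the target scores high; a column with no relationship scores near zero, which is exactly the signal the Importance column summarizes.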



On my computer, about 15 minutes passed between the moment I hit Start and when the automated model-building battle ended. During the process, grab a cup of coffee, hit another Zoom meeting, or watch the next video (Consume New Insights)! DataRobot will notify you when Autopilot is complete. (If needed, you can change your notification preferences in Settings, under your profile icon in the top-right corner of the page.)

Let's go to the Models tab and check out the Leaderboard, which lists the built models from most to least recommended. As you can see, DataRobot identifies the top-ranked contender for our AI as the eXtreme Gradient Boosted Trees Classifier with Early Stopping.


Let’s say you are most familiar with R models. If you scroll down a bit, you'll see that DataRobot put an R model on the Leaderboard: Gradient Boosted Trees Classifier.


Let’s compare the two models and see why DataRobot recommends the model it selected. Navigate to the Models tab and select Model Comparison. From the Model Comparison tool, you can use the dropdowns to compare any of the models on the Leaderboard. Notice that the recommended model, the eXtreme Gradient Boosted Trees Classifier with Early Stopping, is auto-populated. Now choose the R model, Gradient Boosted Trees Classifier, from the other dropdown.



Pulling up the recommended model and the R model, you can see that the LogLoss validation score is better for the recommended model (hint: LogLoss measures prediction error, and a lower score is better). You can also see various other ways to compare the models.
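To make the hint concrete, here is a minimal sketch of how LogLoss is computed; the labels and probabilities are invented for illustration:

```python
# LogLoss penalizes confident wrong predictions heavily; lower is better.
import math

def log_loss(y_true, y_prob, eps=1e-15):
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

y = [1, 0, 1, 1, 0]
model_a = [0.9, 0.2, 0.8, 0.7, 0.1]  # confident and mostly right
model_b = [0.6, 0.4, 0.5, 0.6, 0.5]  # hedging near 0.5
print(log_loss(y, model_a) < log_loss(y, model_b))  # True: A scores better
```

A model that assigns high probability to the outcomes that actually occur earns a lower LogLoss, which is why the Leaderboard sorts on it.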

If you know the monetary values associated with accurate and inaccurate predictions, you can assign them in the Profit Curve section of the Model Comparison tool and see the profit or cost associated with correctly or incorrectly predicting that a shipment will be late.

Now let's go back to the Leaderboard and select the recommended model to explore it a bit more. DataRobot automatically navigates you to the Describe > Blueprint tab to show the blueprint (i.e., the recipe for the model). You can see each action that the algorithm is performing on your data here.



Our model has a lot to tell us. Let’s first jump to Feature Impact (Understand > Feature Impact tab) and click Compute.



We see another automated alert telling us DataRobot has detected redundant features. Scrolling down, you can see the flagged feature, Vendor INCO Term. As you recall from when we looked at features with missing values (in Evaluate Data and Select Target), Vendor INCO Term had the most missing values in the dataset. In the Feature Impact display, we can see that the Vendor INCO Term feature is redundant with the Fulfill Via feature. This is further assurance that it was okay not to supplement the dataset in the pre-modeling stage. We know from the data dictionary that Fulfill Via is the method by which the shipment was made, either directly from the vendor or through the Regional Distribution Center.

In the feature chart we can see that, while our model is assessing whether a shipment will be late, it determines that the most important features are the destination Country, Vendor, and a variety of price metrics. We also see that the features which are not driving the timeliness of shipments, listed at the bottom, include Managed By. According to the data dictionary, this feature reflects which office, US or field, is responsible for the delivery.

When you are mapping out a strategy about how to reduce late shipments, Feature Impact helps you figure out where to focus your efforts in order to make the greatest impact.
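Rankings like Feature Impact are commonly computed with permutation importance: shuffle one column, re-score the model, and measure how much performance drops. A scikit-learn sketch on synthetic data (the feature names are only illustrative, and this is an analogue of the idea, not DataRobot's internal code):

```python
# Permutation importance: a feature the model relies on hurts the score
# a lot when shuffled; an irrelevant one barely moves it.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500
country = rng.integers(0, 10, n)                 # drives lateness below
managed_by = rng.integers(0, 2, n)               # unrelated to lateness
late = ((country % 3 == 0) ^ (rng.random(n) < 0.05)).astype(int)

X = np.column_stack([country, managed_by])
model = GradientBoostingClassifier(random_state=0).fit(X, late)
result = permutation_importance(model, X, late, n_repeats=10, random_state=0)
impact = dict(zip(["Country", "Managed By"], result.importances_mean))
print(max(impact, key=impact.get))  # → Country
```

In the same way, Managed By landing at the bottom of the Feature Impact chart means shuffling it would barely change the model's predictions.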

Next let’s visit Feature Effects (Understand > Feature Effects tab). DataRobot looks at your data, ignores the known outcome of whether the delivery was late, and makes its prediction based on how it evaluates the shipments. We want to see how closely DataRobot’s predictions line up with what actually happened, so select Predicted and Actual, and deselect Partial Dependence.
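The Predicted and Actual view boils down to bucketing rows by a feature value and comparing the model's average predicted probability with the observed rate. A plain-Python sketch with made-up numbers:

```python
# Group rows by a feature value and compare mean prediction vs. actual rate.
from collections import defaultdict

rows = [  # (country, actual_late, predicted_prob) — illustrative values
    ("Burundi", 1, 0.8), ("Burundi", 1, 0.7), ("Burundi", 0, 0.6),
    ("Kenya", 0, 0.2), ("Kenya", 0, 0.3), ("Kenya", 1, 0.4),
]

actual = defaultdict(list)
pred = defaultdict(list)
for country, y, p in rows:
    actual[country].append(y)
    pred[country].append(p)

for c in actual:
    a = sum(actual[c]) / len(actual[c])
    m = sum(pred[c]) / len(pred[c])
    print(f"{c}: actual {a:.2f}, predicted {m:.2f}")
```

When the two lines track each other closely across feature values, as they should here, you gain confidence that the model has learned the real pattern rather than noise.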

This Feature Effects display helps give you additional comfort about choosing the recommended model.



In this chart comparing destination countries, you can see that shipments to Burundi have the highest likelihood of being late. You can scroll through the other top features for more ideas on how to make deliveries more timely. For example, when we look at Vendor, we can see that the Regional Distribution Centers, or RDCs, account for the most late shipments. An interesting insight, right? If the centers are in-region, then why are the shipments late, and what can be done about it?

Let's move on and check out some other ways you can Consume New Insights about your data.

More information

What Features are Important to my Model? (Feature Impact Overview)
Evaluating the Leaderboard
Comparing Models Overview
(Also, see the introduction to all articles in this series) 

Version history
Last update: 10-05-2020 03:35 PM