When you upload your training data, DataRobot automatically performs Exploratory Data Analysis (EDA) to understand the data.
In Figure 1, we have a dataset already uploaded. In the Data Quality Assessment report, we can see this dataset has 51 features and 10,000 records, and that no initial downsampling was done.
Figure 1. Data fields and Data Quality Assessment report
For every field, DataRobot provides the data type along with the number of unique and missing values. For numeric fields, it also provides the mean, standard deviation, median, min, and max values.
Figure 2. Summary Statistics
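To make these statistics concrete, here is a minimal sketch that computes the same kind of summary for a single numeric column using only the Python standard library. It is an illustration, not DataRobot's implementation: the function name `summarize_numeric`, the sample `ages` column, the use of `None` for missing values, and the choice of the sample standard deviation are all assumptions made for the example.

```python
import statistics


def summarize_numeric(values):
    """Summarize one numeric column; None marks a missing value (assumption)."""
    present = [v for v in values if v is not None]
    return {
        "unique": len(set(present)),            # distinct non-missing values
        "missing": len(values) - len(present),  # entries with no value
        "mean": statistics.mean(present),
        "std_dev": statistics.stdev(present),   # sample standard deviation (assumption)
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
    }


# Hypothetical column with one missing entry.
ages = [34, 51, None, 42, 51, 29]
summary = summarize_numeric(ages)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Running the same function over each numeric column of a dataset would give a table comparable in shape to the one in Figure 2.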
When you click a numeric field, DataRobot shows a histogram of its values. You can adjust the number of bins displayed to see a more granular or a more aggregated histogram.
Figure 3. Field Histogram
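The effect of changing the bin count can be sketched with a simple equal-width histogram. This is a hedged illustration in plain Python: the `histogram` function and the sample `values` are hypothetical, and equal-width binning is an assumption about how the bins are formed, not a statement about DataRobot's internals.

```python
def histogram(values, num_bins):
    """Count values in num_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        # If every value is identical (width == 0), everything goes in bin 0.
        index = min(int((v - lo) / width), num_bins - 1) if width else 0
        counts[index] += 1
    return counts


# Hypothetical numeric field.
values = [1, 2, 2, 3, 5, 8, 8, 9, 10, 10]

coarse = histogram(values, 2)  # fewer bins: a more aggregated view
fine = histogram(values, 5)    # more bins: a more granular view

print("2 bins:", coarse)
print("5 bins:", fine)
```

With two bins, the data looks evenly split; with five bins, the gap in the middle of the range becomes visible. That is the trade-off the bin control lets you explore.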
You can also see the most frequent values and their counts by clicking Frequent Values.
Figure 4. Frequent Values
If you click Table, DataRobot presents the field's values in tabular format, sorted by “Count” in descending order.
Figure 5. Table of values
When you drill down into a categorical field, you can view the Frequent Values and Table formats. If the field has missing values, DataRobot also creates a separate bucket for them, as seen in Figure 6.
Figure 6. Categorical values
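The frequency table and the missing-value bucket can be approximated with `collections.Counter` from the Python standard library. This is only a sketch under stated assumptions: the function `frequency_table`, the sample `colors` field, the use of `None` for missing entries, and the bucket label `"Missing"` are hypothetical and may not match how DataRobot labels or stores these values.

```python
from collections import Counter


def frequency_table(values, missing_label="Missing"):
    """Return (value, count) rows sorted by count, highest first.

    Missing entries (None, by assumption) are grouped into one bucket.
    """
    counts = Counter(missing_label if v is None else v for v in values)
    return counts.most_common()


# Hypothetical categorical field with two missing entries.
colors = ["red", "blue", None, "red", "green", None, "red"]
table = frequency_table(colors)

for value, count in table:
    print(f"{value:<10}{count}")
```

Grouping missing entries into their own row keeps them visible next to the real categories instead of silently dropping them from the counts.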
DataRobot also lets you drill into text fields the same way as categorical fields.
Figure 7. Text values
If you’re a licensed DataRobot customer, search the in-app Platform Documentation for “Overview and EDA” and “Time series modeling.”