How DataRobot can maximize metrics for binary classification




DataRobot can derive many binary classification metrics from a confusion matrix, including Accuracy, Balanced Accuracy, F1 Score, Cohen's kappa coefficient, and Matthews correlation coefficient. Each of these metrics is computed from the model's binary (thresholded) output rather than from its predicted probabilities, so none of them can be considered probabilistic.
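To make the link between the confusion matrix and these metrics concrete, here is a minimal Python sketch that computes them from the four cell counts using their standard textbook definitions. The function name `confusion_metrics`, the helper `_safe_div`, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration; this is not DataRobot's implementation.

```python
import math


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero (illustrative choice)."""
    return num / den if den else 0.0


def confusion_metrics(tp, fp, tn, fn):
    """Compute common binary metrics from confusion-matrix counts (hypothetical helper)."""
    n = tp + fp + tn + fn
    accuracy = _safe_div(tp + tn, n)
    tpr = _safe_div(tp, tp + fn)  # sensitivity / recall
    tnr = _safe_div(tn, tn + fp)  # specificity
    balanced_accuracy = (tpr + tnr) / 2
    f1 = _safe_div(2 * tp, 2 * tp + fp + fn)
    # Agreement expected by chance, used by Cohen's kappa.
    expected = _safe_div((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = _safe_div(accuracy - expected, 1 - expected)
    mcc = _safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Made-up counts, for illustration only.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
```

Every value in the returned dictionary depends only on the four counts, which is why changing the decision threshold (and therefore the counts) changes all of them.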

DataRobot concentrates on building models that produce well-calibrated probabilities. For example, when a well-calibrated model predicts 0.80, the event should occur about 80% of the time; when it predicts 0.20, the event should occur about 20% of the time. DataRobot optimizes LogLoss, which yields models with well-calibrated probabilities.
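As a rough illustration of the two ideas above, the following Python sketch computes LogLoss with its standard definition and builds a simple calibration table that compares the mean predicted probability with the observed event rate in each bin. The names `log_loss` and `calibration_table`, the clipping constant `eps`, and the equal-width binning are assumptions for this example, not a description of how DataRobot computes these internally.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood for binary labels (0/1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def calibration_table(y_true, y_prob, n_bins=10):
    """Return (mean_predicted, observed_rate, count) for each non-empty equal-width bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((y, p))
    table = []
    for rows in bins:
        if rows:
            mean_pred = sum(p for _, p in rows) / len(rows)
            observed = sum(y for y, _ in rows) / len(rows)
            table.append((mean_pred, observed, len(rows)))
    return table


# Made-up example: five predictions of 0.8, four of which are events.
labels = [1, 1, 1, 1, 0]
probs = [0.8, 0.8, 0.8, 0.8, 0.8]
table = calibration_table(labels, probs)
```

For a well-calibrated model, the first two entries of each row in the table should be close to each other, as they are in this toy example.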

After modeling, you can choose a decision threshold that maximizes a metric such as F1 Score, Balanced Accuracy, Cohen's kappa coefficient, or Matthews correlation coefficient. DataRobot reports the decision threshold that maximizes F1 Score (shown on the ROC Curve page), and you can use the same information to select decision thresholds that optimize other metrics.
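Choosing a threshold for a given metric can be sketched as a simple sweep: try each candidate threshold, rebuild the confusion matrix, and keep the threshold with the highest score. The Python below is a minimal sketch under stated assumptions: the names `best_threshold`, `f1_score`, and `mcc_score` are hypothetical, candidate thresholds are taken from the distinct predicted probabilities, and a row is labeled positive when its probability is greater than or equal to the threshold. DataRobot's own procedure may differ.

```python
import math


def f1_score(tp, fp, tn, fn):
    den = 2 * tp + fp + fn
    return 2 * tp / den if den else 0.0


def mcc_score(tp, fp, tn, fn):
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0


def best_threshold(y_true, y_prob, metric=f1_score):
    """Return (threshold, score) maximizing `metric`; ties keep the lowest threshold."""
    best = (None, float("-inf"))
    for threshold in sorted(set(y_prob)):
        tp = fp = tn = fn = 0
        for y, p in zip(y_true, y_prob):
            predicted_positive = p >= threshold  # assumed convention
            if predicted_positive and y == 1:
                tp += 1
            elif predicted_positive:
                fp += 1
            elif y == 1:
                fn += 1
            else:
                tn += 1
        score = metric(tp, fp, tn, fn)
        if score > best[1]:
            best = (threshold, score)
    return best


# Made-up labels and probabilities, for illustration only.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
f1_choice = best_threshold(labels, probs, metric=f1_score)
mcc_choice = best_threshold(labels, probs, metric=mcc_score)
```

Passing a different `metric` function is all that is needed to optimize for another confusion-matrix metric; the threshold that maximizes one metric is not necessarily the one that maximizes another.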

You can export the ROC curve as a CSV file to review calculated metrics.

[Image: kb-rcfds-maxmetric-1.png]
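Once the ROC curve data is exported, a short script can scan it for the threshold that maximizes a metric of your choice. The sketch below uses only the Python standard library and picks the row with the highest Balanced Accuracy. The column names (`threshold`, `tp`, `fp`, `tn`, `fn`), the sample rows, and the function name `best_row_by_balanced_accuracy` are all hypothetical; check the header of your own exported file and adjust the names accordingly.

```python
import csv
import io

# Hypothetical sample standing in for an exported file; the real column
# names and values in a DataRobot export may differ.
SAMPLE_CSV = """threshold,tp,fp,tn,fn
0.2,45,30,20,5
0.5,40,10,40,10
0.8,20,2,48,30
"""


def best_row_by_balanced_accuracy(csv_text):
    """Return (threshold, balanced_accuracy) for the best-scoring row."""
    best = (None, float("-inf"))
    for row in csv.DictReader(io.StringIO(csv_text)):
        tp, fp = int(row["tp"]), int(row["fp"])
        tn, fn = int(row["tn"]), int(row["fn"])
        tpr = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
        tnr = tn / (tn + fp) if (tn + fp) else 0.0  # specificity
        score = (tpr + tnr) / 2
        if score > best[1]:
            best = (float(row["threshold"]), score)
    return best


threshold, score = best_row_by_balanced_accuracy(SAMPLE_CSV)
```

To use this with a real export, read the file contents into a string (for example with `open(path).read()`) and pass it to the function after renaming the columns to match.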

Version history
Last update: 04-10-2020 07:41 AM