This article shows how to download a DataRobot model as a dependency-free JAR file that can be used to score data outside of the DataRobot platform. More specifically, we will run the JAR file from the command line.
DataRobot allows you to download scoring code for some models on the Leaderboard, either as Java source code or as a binary file. The binary file comes packaged with all of its dependencies and can be used immediately to make predictions outside of DataRobot. The source code is a non-obfuscated version of the model you see on the Leaderboard tab. It cannot be used until you have added all of its dependencies and compiled it, but it is useful if you want to explore the model’s decision-making process. Not all models on the Leaderboard have associated scoring code.
To download the scoring code, navigate to the Leaderboard tab, look for models with the “scoring code” tag, and identify the model whose scoring code you want to download (Figure 1). The scoring code for a model always produces the same predictions as that model on the Leaderboard. To ensure this, DataRobot generates the Java code from the exact preprocessing steps used by the model on the Leaderboard. It then tests the resulting Java code against the same validation dataset that was used for the original model and confirms that the performance of the two differs by no more than 0.00001.
Figure 1. Leaderboard with models that have a scoring code tag
After you have identified the model whose scoring code you are interested in, click its Predict tab and then its Downloads tab (see Figure 2). From there, you can download the Scoring Code JAR as either a binary file or source code.
Figure 2. Model downloads section
After downloading the binary file, you can immediately use it to score new data. If you use your command line to execute the binary file, the complete syntax for doing so would be similar to the following:
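A minimal sketch of such a command is given here under two assumptions. First, `model.jar` is a placeholder for whatever file name your download has. Second, the JAR is assumed to accept a `csv` subcommand with `--input` and `--output` options, as described in DataRobot's Scoring Code documentation; confirm the options against the documentation for your version before relying on them.

```shell
#!/bin/sh
# Placeholder name (assumption): use the actual file name of the JAR you downloaded.
MODEL_JAR="model.jar"
INPUT_CSV="readmissions_test.csv"           # new data to score
OUTPUT_CSV="readmissions_test_output.csv"   # predictions are written here

if [ -f "$MODEL_JAR" ]; then
    # Assumed syntax: "csv" selects CSV scoring; verify the options for your JAR.
    java -jar "$MODEL_JAR" csv --input="$INPUT_CSV" --output="$OUTPUT_CSV"
else
    echo "Scoring Code JAR not found: $MODEL_JAR" >&2
fi
```

The file-existence check is only there so that the sketch fails with a readable message when the JAR is not in the current directory; the scoring itself is the single `java -jar` line.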
In this example, the input is read from the readmissions_test.csv file and the results are saved to the readmissions_test_output.csv file. Besides scoring new data from the command line, you could also use the binary file in other business processes that can integrate with Java JAR files.