This tutorial walks through the process for classifying images with DataRobot using R.
By the end of this tutorial, you will have an image classification project, a Markdown document that contains an analysis of the best model, and a Shiny app you can use to make predictions on new images (Figure 1).
Specifically, you are going to classify images of droids from Star Wars. This open dataset on GitHub contains images of R2-D2 and BB8; you're going to use these images to create a classification model.
Figure 1. Application
After you confirm that the app classifies the droids correctly, you can push the limits a bit and try out some other droid-like images. For example, we tried images of dogs in droid costumes, among others, and found that the classifier generalized well to these different contexts. This is a fun, informal way to check whether the model is overfit. In our tests, the model built with Visual AI passed with flying colors. We included some fun test images in the zipped folder “Fun Test Images.zip”.
Once you create the app, you will be able to see the deployment and track predictions on the Deployments tab (Figure 2).
Figure 2. Deployment
You will need a few things to complete this tutorial.
Your API Key: In the top right of the browser, click your avatar and select Developer Tools. This takes you to a page for viewing and managing your API tokens.
Your username: This is your login for DataRobot.
Your Endpoint: This is the URL that R uses to connect to DataRobot.
If you are not sure which endpoint applies to you, ask your system administrator.
Your DataRobot Key: This is a key needed to use the prediction server.
You can find this in the Python code on the Integrations tab of one of your deployments. If you don’t have a deployment, ask your system administrator.
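One way to keep these values out of the script itself is to read them from environment variables. The sketch below is a minimal illustration, not the tutorial's own setup code: the environment variable names are hypothetical, and `ConnectToDataRobot()` from the `datarobot` R package is only called when both the endpoint and token are present and the package is installed.

```r
# Read DataRobot credentials from environment variables.
# The variable names below are hypothetical; use whatever names you prefer.
read_credentials <- function() {
  list(
    endpoint      = Sys.getenv("DATAROBOT_ENDPOINT"),
    token         = Sys.getenv("DATAROBOT_API_TOKEN"),
    username      = Sys.getenv("DATAROBOT_USERNAME"),
    datarobot_key = Sys.getenv("DATAROBOT_KEY")
  )
}

# Return the names of any credentials that are still empty.
missing_credentials <- function(creds) {
  names(creds)[!nzchar(unlist(creds))]
}

creds <- read_credentials()
missing <- missing_credentials(creds)

if (length(missing) > 0) {
  message("Missing credentials: ", paste(missing, collapse = ", "))
} else if (requireNamespace("datarobot", quietly = TRUE)) {
  # Connect R to DataRobot using the endpoint and API token.
  datarobot::ConnectToDataRobot(endpoint = creds$endpoint, token = creds$token)
}
```

The username and DataRobot key are not needed to connect; they are kept alongside the other values because the prediction request later in the tutorial uses them.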
Start by unzipping the folder called “droids demo.zip”. You will see a Markdown document along with two images. The Markdown file contains the R code accompanied by explanatory text; this is what you will use to build the model and the prediction app. You can find it on the DataRobot Community GitHub.
This Markdown file shows you how to set up your credentials, connect to DataRobot, and get the files needed for modeling in R. You should end up with a setup that contains two zipped folders of images for train and test data (Figure 3). The train_folder will have two subfolders: one with images of R2-D2 and one with images of BB8.
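Before zipping the training folder, it can help to confirm that each class subfolder actually contains images. The following base R sketch counts image files per subfolder; the subfolder names `R2D2` and `BB8` and the file extensions are assumptions made for the demonstration, which runs against a throwaway folder of empty files rather than the real dataset.

```r
# Count image files in each class subfolder of a training folder.
count_images_by_class <- function(train_dir) {
  class_dirs <- list.dirs(train_dir, full.names = TRUE, recursive = FALSE)
  counts <- vapply(class_dirs, function(d) {
    length(list.files(d, pattern = "\\.(jpe?g|png)$", ignore.case = TRUE))
  }, integer(1))
  names(counts) <- basename(class_dirs)
  counts
}

# Demonstration with a throwaway folder; the subfolder names are hypothetical.
demo_dir <- file.path(tempdir(), "train_folder")
dir.create(file.path(demo_dir, "R2D2"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demo_dir, "BB8"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_dir, "R2D2", "img1.jpg"))
file.create(file.path(demo_dir, "BB8", c("img1.png", "img2.JPG")))

print(count_images_by_class(demo_dir))
```

A subfolder with a count of zero, or an unexpected extra subfolder, is worth fixing before the folder is zipped and uploaded.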
Figure 3. Folders
You will also see code that runs a Visual AI project from zipped folders of images, then retrieves and plots important metrics such as the ROC curve using ggplot2. The rest of this post highlights some key parts of the code.
Running the StartProject code is all you need to do to get DataRobot to build many image classification models. After you run the document once and have a completed project, you can comment out the StartProject code and simply connect to the completed project using its project ID, which you can find in the URL of the completed project.
project <- GetProject('project ID')
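If you prefer to paste the whole browser URL rather than pick out the ID by hand, a small helper can extract it. This is a sketch under one assumption: that the project ID is the path segment immediately after `/projects/` in the URL. The URL and ID below are made up for illustration.

```r
# Extract the project ID from a DataRobot project URL.
# Assumes the ID is the path segment that follows "/projects/".
extract_project_id <- function(url) {
  match <- regmatches(url, regexec("/projects/([^/?#]+)", url))[[1]]
  if (length(match) < 2) stop("No project ID found in URL: ", url)
  match[2]
}

# Made-up URL and ID, for illustration only.
example_url <- "https://app.datarobot.com/projects/0123456789abcdef01234567/models"
project_id <- extract_project_id(example_url)
print(project_id)

# With a live connection you would then reconnect to the finished project:
# project <- datarobot::GetProject(project_id)
```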
Get and Plot ROC Curve
When you have decided which model to use, you can download model information and even plot it using ggplot2. The code for this step grabs the ROC curve and creates a simple plot. With the R DataRobot package, you can recreate the charts you see in DataRobot with ggplot2.
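As a rough illustration of this step, the sketch below plots ROC points with ggplot2. It assumes that `GetRocCurve()` in the `datarobot` package returns a list whose `rocPoints` element is a data frame with `falsePositiveRate` and `truePositiveRate` columns; check the package documentation for your version. The placeholder points exist only so the example runs offline and are not results from any model.

```r
library(ggplot2)

# Build a ROC plot from a data frame of ROC points.
# Column names follow what GetRocCurve() is assumed to return in $rocPoints.
plot_roc <- function(roc_points) {
  ggplot(roc_points, aes(x = falsePositiveRate, y = truePositiveRate)) +
    geom_line() +
    geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
    labs(x = "False positive rate", y = "True positive rate", title = "ROC curve")
}

# Placeholder points so the sketch runs offline; these are not model results.
placeholder <- data.frame(
  falsePositiveRate = c(0, 0.1, 0.4, 1),
  truePositiveRate  = c(0, 0.6, 0.9, 1)
)
p <- plot_roc(placeholder)

# With a finished project (assumed calls from the datarobot package):
# model <- datarobot::ListModels(project)[[1]]
# roc   <- datarobot::GetRocCurve(model)
# plot_roc(roc$rocPoints)
```

The dashed diagonal marks the performance of a random classifier, which makes it easier to judge how far the curve bends toward the top-left corner.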
There is also an example of how to use R to make a prediction on images through the prediction server. This example accesses DataRobot through the “raw” API, which shows that you can work with DataRobot from any language (multi-language support is important when, for example, your IT department prefers C++ or Java).
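library(httr); library(jsonlite)  # POST(), upload_file(), fromJSON()
# Send the CSV of base64-encoded images to the prediction server
r <- POST(URL, body = upload_file("path/image_bas64text.csv", type = "text/plain; charset=UTF-8"),
add_headers("datarobot-key" = datarobot_key),
authenticate(username, token, type = "basic"))
# Parse the JSON response into an R list
body <- fromJSON(content(r, "text", encoding = "UTF-8"))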
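The request above uploads a CSV whose file name suggests it holds base64 text for the images. One way such a file could be produced is sketched below, using `base64_enc()` from jsonlite. The column name `image` is hypothetical and would need to match the image feature in your project, and the demonstration encodes a few arbitrary bytes standing in for a real image file.

```r
library(jsonlite)

# Encode one file as a base64 string.
encode_image <- function(path) {
  raw_bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  base64_enc(raw_bytes)
}

# Write a one-column CSV of base64-encoded images.
# The column name "image" is hypothetical; match it to your project's image feature.
write_image_csv <- function(image_paths, csv_path, column = "image") {
  encoded <- vapply(image_paths, encode_image, character(1), USE.NAMES = FALSE)
  out <- data.frame(encoded, stringsAsFactors = FALSE)
  names(out) <- column
  write.csv(out, csv_path, row.names = FALSE)
  invisible(csv_path)
}

# Demonstration with a few arbitrary bytes standing in for an image file.
fake_image <- tempfile(fileext = ".jpg")
writeBin(as.raw(c(255, 216, 255, 224)), fake_image)
csv_path <- write_image_csv(fake_image, tempfile(fileext = ".csv"))
```

Passing a vector of image paths writes one row per image, so a single request can score several images at once.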