Turning Raw Predictions into Decisions with an API Wrapper
Are you dealing with stacked predictions or pipelines where you have to score 10+ different models and consolidate results based on a predefined business logic? Or, are you dealing with frequently changing business logic?
These are just a few of the scenarios where a simple but versatile API wrapper can help turn raw predictions into actual decisions. This tutorial explains how you can implement a "decision engine" using an API wrapper. As you'll see, the process is quite straightforward:
Create and deploy a model.
Install the Decision Engine.
Configure the Decision Engine.
Generate decisions instead of just predictions.
This tutorial assumes you have a basic knowledge of Docker, Python, and Django.
We will deploy our Docker image locally on a Linux server for this tutorial, but you could just as easily deploy it to AWS EKS or Azure AKS, or take the same logic and implement it with a serverless architecture such as AWS Lambda or Azure Functions.
Before you can use the Decision Engine, you need to complete the configuration. To do so, open a browser of your choice and navigate to http://127.0.0.1:8000.
You will see the DataRobot Decisions GUI.
Enter the previously specified username and password (from Install the Decision Engine) and click Login.
In the displayed DataRobot Decisions - Decision Engine admin page, finalize the configuration as explained below.
Set up a logical entity as an abstraction layer.
Specify a name, for example "LoanA".
Add pre- and post-processing business logic and link it to the previously created logical abstraction layer.
Specify the name, for example "LoanALogic", and paste the Python sample code shown below.
# -*- coding: utf-8 -*-
"""
Created on 2020/07/09

@author: Felix Huthmacher
"""
import pandas as pd


def data_prepare(features_df):
    ## data preparation / pre-processing business logic
    # e.g. enhance the web service input with some other features
    features_df['FICO'] = '850'
    # Specify the deployment ID / model for scoring based on a certain feature/business entity;
    # you could specify a different deployment for each row in the dataframe
    features_df['deployment_id'] = features_df['sub_grade'].apply(
        lambda x: '<REPLACE WITH YOUR DEPLOYMENT_ID>' if x == 'A2' else '<REPLACE WITH YOUR DEPLOYMENT_ID>')
    return features_df


def business_logic(df_new):
    ## post-processing business logic
    # e.g. different thresholds based on geographic area
    df_new['decision'] = df_new['addr_state'].apply(lambda x: '1' if x == 'CA' else '0')
    return df_new['decision'], df_new
Make sure to update the code snippet with the corresponding deployment ID that you created in step 4 (Create and Deploy a Model).
The method data_prepare allows you to add pre-processing steps such as data enrichment, feature engineering, or duplicated input rows for scoring against multiple models in parallel. Each row in the dataframe can be pointed to a different deployment ID / model for scoring based on bespoke business logic.
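As a sketch of the row-duplication idea, a data_prepare variant could repeat every input row once per deployment so that each record is scored against multiple models in parallel (the deployment IDs below are placeholders, and the function name follows the convention described above):

```python
import pandas as pd


def data_prepare(features_df):
    """Duplicate each input row once per deployment so every record
    is scored by every model (illustrative sketch; IDs are placeholders)."""
    deployment_ids = ['<DEPLOYMENT_ID_A>', '<DEPLOYMENT_ID_B>']
    ids_df = pd.DataFrame({'deployment_id': deployment_ids})
    # Cross join (pandas >= 1.2): repeats the input rows once per deployment ID
    return features_df.merge(ids_df, how='cross')


# Two input rows scored against two deployments yields four scoring rows
df = pd.DataFrame({'sub_grade': ['A2', 'B1']})
prepared = data_prepare(df)
```

The scoring step can then route each row to the deployment named in its deployment_id column.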
The method business_logic allows you to consolidate scoring results based on predefined business logic. For example you can define different probability thresholds based on geographic data /customer segments, or roll up results from multiple models before returning results. This way you can return decisions rather than just raw prediction results, which simplifies integrations with downstream systems.
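For example, a business_logic variant might combine both ideas: apply a stricter probability threshold in one region and roll up the votes of several models into a single decision per loan. This is only a sketch; the column names (loan_id, addr_state, positive_probability) are assumptions about the scoring output, not a fixed schema:

```python
import pandas as pd


def business_logic(df_new):
    """Consolidate per-model scores into one decision per loan.
    Assumes columns 'loan_id', 'addr_state', 'positive_probability'."""
    # Stricter threshold for California, looser elsewhere
    df_new['threshold'] = df_new['addr_state'].apply(
        lambda x: 0.7 if x == 'CA' else 0.5)
    # Each model "votes" if its score clears the threshold
    df_new['model_vote'] = df_new['positive_probability'] >= df_new['threshold']
    # Roll up: approve (1) only if every model voted yes
    return df_new.groupby('loan_id')['model_vote'].all().astype(int)


# Two loans, each scored by two models
scores = pd.DataFrame({
    'loan_id': [1, 1, 2, 2],
    'addr_state': ['CA', 'CA', 'TX', 'TX'],
    'positive_probability': [0.80, 0.75, 0.60, 0.40],
})
decisions = business_logic(scores)
```

Downstream systems then receive a single decision column instead of several raw probabilities.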
Specify the prediction server instance
The last step is to specify the prediction server instance and credentials that we want to use for our predictions.
To do this, click Change, then specify the name, server URL, DataRobot key (only required for a DataRobot Managed AI Cloud deployment), username, and API token, as well as the default logic connector. (All connection details and credentials can be found in the sample code from step 4, Create and Deploy a Model.)
Click Save when done.
Generate decisions instead of just predictions
Now that we have completed the configuration, we can use our Decision Engine. You can download the Postman collection that includes a sample REST and SOAP request from here.
By default the Decision Engine supports basic authentication.
The username and password can be configured in the Django settings.py.
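The exact setting names depend on how the wrapper reads its credentials; as a sketch, the entries in settings.py might look like this (the names below are illustrative, not official Django or DataRobot settings):

```python
# settings.py (excerpt) -- setting names are illustrative
# Credentials checked by the wrapper's basic-authentication handler
BASICAUTH_USERNAME = 'decision_engine_user'
BASICAUTH_PASSWORD = 'change-me'  # use a strong secret in production
```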
DataRobot natively supports a REST API, but you can easily convert this Decision Engine (i.e., API Wrapper) to a SOAP API as shown below. Input and Output structure can be adjusted as needed.
Example SOAP request
Example REST request
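As a minimal sketch of what a REST call against the wrapper could look like from Python, the snippet below builds a basic-authentication header and a JSON body; the endpoint path and payload shape are assumptions to adapt to your own configuration:

```python
import base64
import json

# Hypothetical endpoint and payload shape -- adjust to your configuration
url = 'http://127.0.0.1:8000/api/decision/'
payload = {'data': [{'sub_grade': 'A2', 'addr_state': 'CA'}]}

# Basic-authentication header built from the username/password in settings.py
credentials = base64.b64encode(b'decision_engine_user:change-me').decode()
headers = {
    'Authorization': f'Basic {credentials}',
    'Content-Type': 'application/json',
}
body = json.dumps(payload)

# To send the request, e.g. with the requests library:
# import requests
# response = requests.post(url, data=body, headers=headers)
```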
Now that we have created our Decision Engine (i.e., API wrapper), we can turn our raw predictions into actionable decisions. Additionally, we can encapsulate business logic and put governance around it through logging and versioning. Because security is important, the Decision Engine restricts who can change business logic, and both altering business logic and generating decisions require authentication.
Every change can be logged, and as soon as a particular business logic has been used to generate decisions, it cannot be altered; instead, users have to create new versions.
If you have followed any of my previous tutorials, then you already know what is coming next.
Because we are leveraging the DataRobot Prediction API, we automatically benefit from its built-in monitoring functionality; this enables us to monitor a model's performance and benchmark it against other models. It also allows us to replace the model at any point in time without having to write or change a single line of code.
Finally, this sample code can also easily be modified to work with different scoring methods, such as portable prediction servers and Scoring Code, or to expose different API routes and protocols.
Full source code can be found in the Community GitHub here.