Turning Raw Predictions into Decisions with an API Wrapper


Are you dealing with stacked predictions or pipelines where you have to score 10+ different models and consolidate the results based on predefined business logic? Or are you dealing with frequently changing business logic?

These are just a few of the scenarios where a simple but versatile API wrapper can help with turning raw predictions into actual decisions. This tutorial explains how you can implement a "decision engine" using an API wrapper. As you'll see, the process is quite straightforward:

  1. Create and deploy a model.
  2. Install the Decision Engine.
  3. Configure the Decision Engine.
  4. Generate decisions instead of just predictions.

Prerequisites

This tutorial assumes basic knowledge of Docker, Python, and Django.

We will deploy our Docker image locally on a Linux server for this tutorial, but you could just as easily deploy it to AWS EKS or Azure AKS or, alternatively, take the same logic and implement it with a serverless architecture such as AWS Lambda or Azure Functions.

You can download Docker for your OS from here:
https://hub.docker.com/editions/community/docker-ce-server-centos and follow the install instructions for your respective Linux distribution (e.g., for CentOS you can follow the instructions here: https://docs.docker.com/engine/install/centos/).

To install the latest version of Docker Engine and containerd, run the following command.

$ sudo yum install docker-ce docker-ce-cli containerd.io

Create and deploy a model

In this tutorial we will create a model with DataRobot AutoML and subsequently deploy it to a DataRobot prediction server.

In this particular use case, we will use our public Lending Club dataset (10K_Lending_Club_Loans.csv) to predict the likelihood of default. You can download the dataset from here:
https://s3.amazonaws.com/datarobot_public_datasets/10K_Lending_Club_Loans.csv.

Because we are using DataRobot AutoML, our model is just a few clicks away.

  1. Drop the training data into DataRobot.

    Select URL import, paste in the above URL, and click Create New Project.
  2. Select the target and click Start.

    Specify the is_bad column as the target, and then click Start to kick off the Quick Autopilot modeling mode.
  3. Deploy the recommended model to DataRobot prediction server.

    Once Quick Autopilot is complete, switch to the Leaderboard and select a model from the top.

    Now navigate to the Predict > Deploy tab and click Deploy model. On the deployment overview page, give the deployment a name, enable drift tracking, and then click Create deployment.
  4. Navigate to the Predictions > Prediction API tab. Take note of the deployment ID along with the API token (API_KEY), DataRobot key, and prediction server URL (API_URL).


    We will reference these credentials in the API wrapper later.
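
Before wiring these credentials into the API wrapper, you can optionally sanity-check the deployment with a direct call to the DataRobot Prediction API. The snippet below is only a minimal sketch: the exact route and headers for your deployment are shown in the sample code on the Prediction API tab, and API_URL, API_KEY, DATAROBOT_KEY, and DEPLOYMENT_ID are placeholders for the values you just noted.

import requests

# Placeholders for the credentials noted on the Prediction API tab
API_URL = '<YOUR PREDICTION SERVER URL>'
API_KEY = '<YOUR API TOKEN>'
DATAROBOT_KEY = '<YOUR DATAROBOT KEY>'
DEPLOYMENT_ID = '<YOUR DEPLOYMENT_ID>'

headers = {
    'Content-Type': 'text/csv; charset=UTF-8',
    'Authorization': 'Bearer {}'.format(API_KEY),
    'DataRobot-Key': DATAROBOT_KEY,
}

# Score a small CSV sample against the deployed model
with open('10K_Lending_Club_Loans.csv', 'rb') as f:
    response = requests.post(
        '{}/predApi/v1.0/deployments/{}/predictions'.format(API_URL, DEPLOYMENT_ID),
        headers=headers,
        data=f,
    )

print(response.status_code)
print(response.json())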

Install the Decision Engine

Now that the model is deployed, you can pull the Docker image containing the API wrapper. Use the following command to install the Decision Engine (i.e., API wrapper):

 

docker run -it -p 8000:8000 \
    -e DJANGO_SUPERUSER_USERNAME=<USERNAME> \
    -e DJANGO_SUPERUSER_PASSWORD=<PASSWORD> \
    -e DJANGO_SUPERUSER_EMAIL=<EMAIL> \
    felix85/datarobot_decisions_engine

 


Important: Before running the above command for the first time, replace <USERNAME>, <PASSWORD>, and <EMAIL> with your respective credentials.

Also, if you want to keep the service running even after your console is closed, run the container in detached mode instead:

 

docker run -d -p 8000:8000 \
    -e DJANGO_SUPERUSER_USERNAME=<USERNAME> \
    -e DJANGO_SUPERUSER_PASSWORD=<PASSWORD> \
    -e DJANGO_SUPERUSER_EMAIL=<EMAIL> \
    felix85/datarobot_decisions_engine
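
Once the container is running, you can optionally confirm from Python that the service is reachable before moving on to configuration. This minimal sketch assumes the -p 8000:8000 port mapping used above.

import requests

# Expect a successful response (the login page) once the container is up
resp = requests.get('http://127.0.0.1:8000')
print(resp.status_code)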

 

Configure the Decision Engine

Before you can use the Decision Engine, you need to complete the configuration. To do so, open a browser of your choice and navigate to http://127.0.0.1:8000.

You will see the DataRobot Decisions GUI.



Enter the previously specified username and password (from Install the Decision Engine) and click Log in.

In the displayed DataRobot Decisions - Decision Engine admin page, finalize the configuration as explained below. 

  1. Set up a logical entity as an abstraction layer.

    Specify a name, for example ‘LoanA’.

  2. Add pre- and post-processing business logic and link it to the previously created logical abstraction layer.


    Specify the name, for example “LoanALogic,” and paste the Python sample code shown below.

    # -*- coding: utf-8 -*-
    """
    Created on 2020/07/09

    @author: Felix Huthmacher

    """

    import pandas as pd
    import datetime


    ## data preparation / pre-processing business logic
    def data_prepare(features_df):
        # e.g., enrich the web service input with additional features
        features_df['FICO'] = '850'

        # Specify the deployment ID / model for scoring based on a certain feature/business entity.
        # You could specify a different deployment for each row in the dataframe.
        features_df['deployment_id'] = features_df['sub_grade'].apply(
            lambda x: '<REPLACE WITH YOUR DEPLOYMENT_ID>' if x == 'A2' else '<REPLACE WITH YOUR DEPLOYMENT_ID>')

        return features_df


    ## post-processing business logic
    def business_logic(df_new):
        # e.g., different thresholds based on geographic area
        df_new['decision'] = df_new['addr_state'].apply(lambda x: '1' if x == 'CA' else '0')

        return df_new['decision'], df_new


    Make sure to update the code snippet with the deployment ID you noted in step 4 of Create and Deploy a Model.

    The above sample code includes two methods:

    - data_prepare(features_df)
    - business_logic(df_new)

    The method data_prepare allows you to add pre-processing steps such as data enrichment, feature engineering, or duplicating input rows for scoring against multiple models in parallel.

    Each row in the dataframe can be pointed to a different deployment ID / model for scoring based on bespoke business logic.

    The method business_logic allows you to consolidate scoring results based on predefined business logic. For example, you can define different probability thresholds per geographic area or customer segment, or roll up results from multiple models before returning them. This way you can return decisions rather than just raw prediction results, which simplifies integration with downstream systems. (A small local test of both methods is sketched after this list.)

  3. Specify the prediction server instance

    The last step is to specify the prediction server instance and credentials that we want to use for our predictions.

    To do this, click Change and then specify the name, server URL, DataRobot key (only required for a DataRobot Managed AI Cloud deployment), username, and API token, as well as the default logic connector. (All connection details and credentials can be found in the sample code from step 4 of Create and Deploy a Model.)

    Click Save when done.

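
To see what the engine does with the two methods from step 2, here is a minimal local sketch. It assumes the data_prepare and business_logic functions above are defined in the same Python session; inside the Decision Engine, each row would be scored against its assigned deployment_id between the two calls.

import pandas as pd

# Tiny sample using the Lending Club column names referenced in the business logic
sample = pd.DataFrame({
    'sub_grade':  ['A2', 'C1'],
    'addr_state': ['CA', 'TX'],
})

prepared = data_prepare(sample)            # adds FICO and a per-row deployment_id
# ... the Decision Engine would score each row against its deployment_id here ...
decision, enriched = business_logic(prepared)

print(decision.tolist())                   # ['1', '0'] with the toy threshold rule above
print(enriched)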

Generate decisions instead of just predictions

Now that we have completed the configuration, we can use our Decision Engine. You can download the Postman collection that includes a sample REST and SOAP request from here.

By default the Decision Engine supports basic authentication.


The username and password can be configured in the Django settings.py file.


DataRobot natively supports a REST API, but you can easily convert this Decision Engine (i.e., API wrapper) to a SOAP API as shown below. The input and output structures can be adjusted as needed.

Example SOAP request


Example REST request
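
For reference, here is a minimal sketch of a REST request against the Decision Engine using basic authentication. The endpoint path and payload below are illustrative placeholders only; take the actual route and request body from the Postman collection referenced above.

import requests
from requests.auth import HTTPBasicAuth

# Illustrative payload using a few Lending Club features; see the Postman collection
# for the actual request body expected by the Decision Engine.
payload = {
    'sub_grade': 'A2',
    'addr_state': 'CA',
}

response = requests.post(
    'http://127.0.0.1:8000/<DECISION_ENGINE_ROUTE>',   # placeholder route
    json=payload,
    auth=HTTPBasicAuth('<USERNAME>', '<PASSWORD>'),    # basic auth credentials
)

print(response.status_code)
print(response.text)  # decision(s) produced by the business_logic step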

Final thoughts

Now that we have created our Decision Engine (i.e., API wrapper), we can turn raw predictions into actionable decisions. We can also encapsulate business logic and put governance around it through logging and versioning. Because security is important, not everyone can change the business logic, and altering business logic or generating decisions requires authentication.

Every change can be logged, and as soon as a particular piece of business logic has been used to generate decisions, it can no longer be altered; instead, users have to create new versions.

If you have followed any of my previous tutorials, then you already know what is coming next.

Because we are leveraging the DataRobot Prediction API, we automatically benefit from its built-in monitoring functionality; this enables us to monitor a model’s performance and benchmark it against other models. It also allows us to replace the model at any point in time without having to write or change a single line of code.


Finally, this sample code can also easily be modified to work with different scoring methods, such as portable prediction servers and scoring code, or to expose different API routes and protocols.

Full source code can be found in the Community GitHub here.
