In DataRobot I see some blueprints use ‘Ridit Transformation’ for the numeric features. How does this transformation work?
I’m planning to deploy a coefficient-based model in a low-latency environment that is isolated from the DataRobot environment. To operationalize the model, I need to replicate the feature engineering steps and apply the coefficients estimated by DataRobot to get the predicted probability of the positive event. If I want to perform the Ridit transformation in my own data preparation pipeline, how would I do that?
DataRobot has its own implementation of the Ridit transformation, so you can’t get exactly the same result if you transform features outside DataRobot. The good news is that you can use the scikit-learn modules below to get something very similar:
Below is an illustration of how to mimic DataRobot’s implementation of the Ridit transformation (100 quantiles mapped to the range [-1, 1]) in a binary classification project, along with a test of the difference between the predicted probabilities. In this example, applying the same coefficients to the manually Ridit-transformed feature in the holdout set gives predictions very similar to DataRobot’s.
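As a rough sketch of that idea, the snippet below approximates the transformation with scikit-learn's `QuantileTransformer` (100 quantiles, uniform output) and rescales the result from [0, 1] to [-1, 1]. This is an assumption about how to get close to DataRobot's behavior, not its actual implementation. The data is synthetic, and the `coef` and `intercept` values are hypothetical placeholders for the coefficients you would export from your DataRobot model.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer


def fit_ridit_like(train_values, n_quantiles=100):
    """Fit a quantile transformer on one numeric feature from the training data."""
    qt = QuantileTransformer(
        n_quantiles=n_quantiles,
        output_distribution="uniform",  # maps values to [0, 1]
        random_state=0,
    )
    qt.fit(np.asarray(train_values, dtype=float).reshape(-1, 1))
    return qt


def ridit_like_transform(qt, values):
    """Apply the fitted transformer and rescale from [0, 1] to [-1, 1]."""
    u = qt.transform(np.asarray(values, dtype=float).reshape(-1, 1)).ravel()
    return 2.0 * u - 1.0


def predict_proba_positive(x_transformed, coef, intercept):
    """Logistic link applied to a single transformed feature."""
    z = intercept + coef * x_transformed
    return 1.0 / (1.0 + np.exp(-z))


# Synthetic stand-ins for a training and a holdout feature column.
rng = np.random.default_rng(0)
train = rng.lognormal(size=1000)
holdout = rng.lognormal(size=200)

# Fit on training data only, then reuse the fitted quantiles on holdout.
qt = fit_ridit_like(train)
train_ridit = ridit_like_transform(qt, train)
holdout_ridit = ridit_like_transform(qt, holdout)

# Hypothetical placeholder coefficients, NOT values from a real DataRobot model.
coef, intercept = 0.8, -1.2
holdout_proba = predict_proba_positive(holdout_ridit, coef, intercept)
```

To check how close the replication is, you could compare `holdout_proba` against the predictions DataRobot returns for the same holdout rows, for example by looking at the largest absolute difference. Missing values are not handled in this sketch, so any imputation step in the blueprint would have to be replicated separately.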