Prediction Explanations as 'NaN'


I have downloaded the JAR with prediction explanations turned on. I tested it on 100k rows and the prediction explanations were null for 78k of them. Screenshots are attached below. Scores are generated for all records, but prediction explanations are not.

Any idea how to resolve this?

 

(Screenshots: SaiP_1-1715287383180.png, SaiP_0-1715287330632.png)

 



3 Replies
MarkR
Data Scientist

Hi SaiP. I think the prediction explanations algorithm is applying thresholds to your rows. For a given pair (threshold_low, threshold_high), you will get prediction explanations only for rows where the prediction value is outside this interval. The assumption is that explanations are most useful for the "extreme" predictions, and less useful for predictions in the middle.
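
To make the rule concrete, here is a small illustrative sketch of that selection logic (my own hypothetical helper, not DataRobot's actual implementation):

# Illustrative only: the thresholding rule described above, written as a hypothetical helper.
def gets_explanation(prediction: float, threshold_low: float, threshold_high: float) -> bool:
    # A row receives prediction explanations only when its prediction falls
    # outside the (threshold_low, threshold_high) interval.
    return prediction < threshold_low or prediction > threshold_high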

 

If you initialized the prediction explanations with the default thresholds, then they are based on quantiles of your data. You are seeing 78% with no explanations, 22% with explanations, which sounds to me like it's close to the defaults.
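
If you want to sanity-check that against your scored output, a quick diagnostic is to measure what fraction of predictions fall inside a candidate interval. This is only a sketch; the file name, prediction column name, and threshold values below are placeholders for whatever your project actually uses:

import pandas as pd

# Placeholders: adjust the file path, the prediction column name, and the thresholds.
scores = pd.read_csv("scored_output.csv")
threshold_low, threshold_high = 0.2, 0.8
inside = scores["prediction"].between(threshold_low, threshold_high)
print(f"{inside.mean():.1%} of rows fall inside the interval and would get no explanations")

If that fraction comes out near your 78%, the thresholds are what is producing the nulls.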

 

To get explanations for more rows using your existing JAR file, you can pass explicit thresholds. If you want absolutely all rows to be explained, you can do this:

 

result = model.predict(dat_samp, max_explanations=10, threshold_low=0.5, threshold_high=0.5)

This should give you explanations for any row whose prediction is either lower than 0.5 or greater than 0.5. It might miss rows whose prediction is exactly 0.5; if that happens, pick a value other than 0.5 that isn't exactly equal to any prediction.
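
For completeness, here is a minimal end-to-end sketch, assuming the JAR is scored locally with the datarobot-predict package's ScoringCodeModel (the file paths and input data below are placeholders, and running scoring code locally also requires a Java runtime):

import pandas as pd
from datarobot_predict.scoring_code import ScoringCodeModel

# Load the scoring code JAR you already downloaded (path is a placeholder).
model = ScoringCodeModel("model.jar")

# Placeholder input; use the same feature columns the model expects.
dat_samp = pd.read_csv("input_100k.csv")

# With both thresholds set to the same value, only predictions exactly equal
# to 0.5 are skipped; every other row should come back with explanations.
result = model.predict(
    dat_samp,
    max_explanations=10,
    threshold_low=0.5,
    threshold_high=0.5,
)
print(result.head())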

 

If you don't want to pass explicit thresholds, I think you could re-initialize the prediction explanations, with the thresholds turned off. Then download the JAR again, and it should have no thresholds applied by default.

 

API reference for datarobot-predict, showing how to override the thresholds in the JAR: https://datarobot.github.io/datarobot-predict/1.8/api_ref/#datarobot_predict.deployment.predict

 

UI docs showing where the thresholds were originally set: https://docs.datarobot.com/en/docs/modeling/analyze-models/understand/pred-explain/xemp-pe.html#chan... 

Hi Mark, 

Thanks for the reply. I have tested both methods you mentioned.
Passing explicit low and high thresholds works (and I learned about an additional capability of the predict function).
Re-downloading the JAR after removing the default thresholds under Understand > Prediction Explanations does not work. Any idea how to resolve this? I am asking because a few months back I downloaded a JAR that gives prediction explanations irrespective of the thresholds, but the new JAR does not (even after deselecting the thresholds and re-downloading). I want to understand what exactly changed, since we will be deploying the JAR to production and this understanding will be helpful.

 

Thanks 

MarkR
Data Scientist

SaiP, I was mistaken. I tried my suggestion in a test project and found the behavior you described: the threshold settings are locked in the first time the scoring code is downloaded for a model, and later changes are not applied to subsequent downloads.

For future models, you can make sure the thresholds are set as desired before you download the scoring code for the first time. But for this model, whose scoring code you have already downloaded once, your only option is passing explicit thresholds.
