
Can't run Prediction Explanations with the recommended model


Hello,

Thank you for a great product!

I have a question about the DataRobot platform.

Did DataRobot change how Prediction Explanations work?

Background: Today (2023/01/05), after creating a regression model with date/time partitioning in Autopilot mode, I tried to run Prediction Explanations with the recommended model using this code:

import datarobot as dr

# `project`, `model`, and `dataset` were obtained earlier in the session
# Compute predictions
predict_job = model.request_predictions(dataset.id)
predict_job.wait_for_completion()
# Initialize prediction explanations (this is the call that fails)
pei_job = dr.PredictionExplanationsInitialization.create(project.id, model.id)
pei_job.wait_for_completion()
# Compute prediction explanations with default parameters
pe_job = dr.PredictionExplanations.create(project.id, model.id, dataset.id)
pe = pe_job.get_result_when_complete()
# Iterate through predictions with prediction explanations
for row in pe.get_rows():
    print(row.prediction)
    print(row.prediction_explanations)
# Download to a CSV file
pe.download_to_csv('prediction_explanations.csv')

but this error occurred:

ClientError: 422 client error: {'message': 'Prediction explanations are not supported for models that use validation data for training.'}

The command that causes the error is:

pei_job = dr.PredictionExplanationsInitialization.create(project.id, model.id)

From the error message, I think the problem is that validation data is included in the recommended model's training. But this error has never happened before, and the last time I used this without any problem was 2022/12/26. So I think something must have changed inside DataRobot.
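As a rough illustration of how this specific failure could be detected before going further, here is a minimal sketch. It is not part of the DataRobot client: `try_initialize` and `VALIDATION_MARKER` are hypothetical names, and the `status_code` attribute on the raised error is an assumption. The helper takes any zero-argument callable, so it does not import `datarobot` itself.

```python
from typing import Callable, Optional, Tuple

# Substring copied from the 422 message quoted above.
VALIDATION_MARKER = "not supported for models that use validation data for training"


def try_initialize(init_fn: Callable[[], object]) -> Tuple[bool, Optional[str]]:
    """Call init_fn and wait for its job; return (ok, reason).

    Returns (False, reason) only for the specific error from this thread.
    Any other exception is re-raised unchanged.
    """
    try:
        job = init_fn()
        # DataRobot job objects expose wait_for_completion(); the hasattr
        # check lets simple stand-ins (for example in tests) work too.
        if hasattr(job, "wait_for_completion"):
            job.wait_for_completion()
        return True, None
    except Exception as exc:
        # Assumption: the client error carries a `status_code` attribute.
        # If it is missing, fall back to matching on the message alone.
        status = getattr(exc, "status_code", None)
        if status in (None, 422) and VALIDATION_MARKER in str(exc):
            return False, "model uses validation data for training"
        raise
```

With the names from the code above, a call might look like `try_initialize(lambda: dr.PredictionExplanationsInitialization.create(project.id, model.id))`. Checking the result before deploying would surface the problem before any scoring with explanations is attempted.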

Even worse, when I deploy that recommended model and run predictions with this batch prediction code:

job = dr.BatchPredictionJob.score(
    deployment=DEPLOYMENT_ID,
    intake_settings={
        'type': 'localFile',
        'file': input_file,
    },
    output_settings={
        'type': 'localFile',
        'path': output_file,
    },
    max_explanations=3,
)

A new error occurs, saying that I can't do that because prediction explanations are not initialized.

Is this intentional or a bug? And how can I fix it?

Some more information about my environment:

- Python client version: 2.24.0

- Here is my new model as shown on the Deploy tab in the GUI:

khoahv_0-1672925867578.png

Accepted Solutions
jenD
DataRobot Employee

Hello and good news! The fix you are waiting for has been merged, and in a stroke of good luck, the deployment is happening a day early, so it will be available on Tuesday. Thanks for bringing it to our attention and for your patience.


12 Replies

Hi, Khoa!

I believe this is standard DataRobot behavior: Time Series projects don't have their metrics or prediction explanations calculated when they are retrained on 100% of the dataset. You can instead use a model that still has validation data available to produce prediction explanations.

If you have seen different behavior, could you show an example of a TS model trained on 100% of the data with prediction explanations (for example, a screenshot from a different project)?

Hi, Bogdan,

Thanks for your fast reply!

I agree with the "don't have their metrics calculated" point, but I don't think "don't have prediction explanations calculated" is standard behavior. Here are some logs from models I created previously, which all show that I successfully used Prediction Explanations on models trained on 100% of the dataset. The most recent one is from 2022/12/26:

khoahv_0-1672932142043.png

Here is a model created on 2022/10/03:

khoahv_1-1672932307230.png

And here is a model created on 2021/12/01, with Prediction Explanations re-run on 2022/05/13:

khoahv_2-1672932501606.png

By contrast, here is the log of today's model. As you can see, there are no Prediction Explanations after the normal prediction at 01-05-2023 07:31:53, which is when the error occurred.

khoahv_3-1672932803480.png

Any idea why the behavior changed like that?


Thank you for your log screenshots!

I'm currently trying to verify this behavior with our internal engineering team. Meanwhile, could you show what the Understand -> Prediction Explanations tab says for the models that had prediction explanations calculated?

Here is a screenshot of the Prediction Explanations tab.

khoahv_0-1672934247346.png


As I understand it, the Prediction Explanations tab helps users understand the model using data from the uploaded dataset. Because this model was retrained on 100% of the data, it has nothing to show. But please note that what I am trying to do is create Prediction Explanations for completely new data, not the data in the uploaded dataset.

Because I cannot run PredictionExplanationsInitialization with the new model, I can't run BatchPredictionJob with explanations after it is deployed. Beyond the screenshots I showed, I don't think this is standard behavior: why would DataRobot prepare and recommend a model that cannot be used with BatchPredictionJob? So I think the problem is that PredictionExplanationsInitialization is somehow blocked.

If this change is intentional for some reason, do you know of any way to initialize explanations for the deployed model? I need to fix this as soon as possible, and we have about 7 hours to find a solution.

Hi, I've come back with some information to share. This is indeed a bug: PredictionExplanationsInitialization isn't working properly. We've confirmed that with the engineering team, and they are working on a fix. However, the fix shouldn't be expected before next week (it is tied to our update schedule).
So the best workaround I can offer for now is to build a new project with a small holdout, train the best model there, retrain it up to the holdout start, and try to deploy it with prediction explanations.

The team and I are sorry for this experience on such a time-restricted project. We've learned from our mistakes and will do better in the future.

Thanks for the update, and thanks to the engineering team for the fast response.

I understand the situation. Could you share your update schedule for next week? Will it be Monday or Friday?

That's a hard question. New versions are typically deployed on Wednesdays, so if the fix is ready in the next few days and all goes well in testing, the PredEx fix would land next Wednesday.

Thank you. So the best case is next Wednesday (1/11), and the worst case is the following scheduled update. Is that right?

Right now our team is using a temporary workaround for this problem. But our customers have high standards. I know it will be hard for your team, but it would help us enormously if this were fixed by next Wednesday. Please let me know in advance whether your team can make it or not.