When getting predictions from a DataRobot project via the Python API, it is possible to kick off the predictions, terminate the script, and later find the predictions in a list and select the one you want without knowing specific IDs, as long as you know enough about what things are called. But when getting prediction explanations, there does not seem to be any way to terminate the Python script and pick up later to check whether the job has finished and then get the explanations.
The best option I could see is to store the prediction explanation job ID somewhere, perhaps in a file on disk, and read it back when restarting the script. But that seems not to work if the job itself terminates in the meantime. [After the fact, there are no jobs in the project job list, etc.]
I want a method so that I can kick off a prediction explanation job and then terminate the Python script, and just check every now and then to see whether it has finished. Part of the reason for this is so that I can actually shut down the Python script host between runs.
I got past my blockage using the information from both @dustin.burke and @Lukas. However, I cannot offer any neat post mortem, as I had to just generally juggle the code after gaining the required inspiration.
In general, you want to use the API methods which support an async (non-blocking) workflow. When looking at the Python API documentation, look for methods which return a Job object.
Most API methods support this by creating a Job that is run asynchronously within the app, and then you can use the Job's methods to check on the status, wait_for_completion and get_result. See the Job tutorial for examples.
If your script needs to be interruptible, you'll want to save the state of the Job so that it can be looked up later. The Python API methods return a Job object, which could be stored using Python's shelve module (although I haven't tested it). Alternatively, you could use Project.get_all_jobs to look the jobs up afterwards without storing any state locally; your script would then need to loop through the jobs to identify which one to proceed with.
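As a rough illustration of the "save the state" idea, here is a minimal sketch that stores only the project ID and job ID in a JSON file rather than pickling the whole Job object. The file name and helper names are hypothetical. It assumes that dr.Job.get(project_id, job_id) can still find the job after a restart; the original poster reported that this lookup failed once the job had finished, so verify that against your client version before relying on it.

```python
import json
from pathlib import Path

STATE_FILE = Path("pe_job_state.json")  # hypothetical file name


def save_job_state(project_id, job_id, path=STATE_FILE):
    """Write the identifiers needed to find the job again after a restart."""
    path.write_text(json.dumps({"project_id": project_id, "job_id": str(job_id)}))


def load_job_state(path=STATE_FILE):
    """Return the saved identifiers, or None if nothing has been saved."""
    if not path.exists():
        return None
    return json.loads(path.read_text())


def check_saved_job(path=STATE_FILE):
    """Return the job's result if it has completed, otherwise None."""
    # Imported lazily so the two helpers above work without the client installed.
    import datarobot as dr
    from datarobot.enums import QUEUE_STATUS

    state = load_job_state(path)
    if state is None:
        return None
    # Assumption: the job can still be fetched by ID at this point.
    job = dr.Job.get(state["project_id"], state["job_id"])
    if job.status == QUEUE_STATUS.COMPLETED:
        return job.get_result()
    return None
```

The first run would call save_job_state(project.id, job.id) right after requesting the job and then exit; each later run would call check_saved_job() and stop again if it returns None.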
For requesting prediction explanations, there are 4 steps involved. Each step will involve making an async API request that has some intermediate Job tracking it:
Hopefully that helps. Let us know if you need more specific troubleshooting or guidance to get this working.
Or maybe I misunderstood the question.
If it's about knowing whether the job has finished, you could do the same, and then check whether the filtered list for this model ID, dataset ID, etc. is still empty.
@Lukas I already know and use all that - so I guess I asked the wrong question somehow. I will get back to this.
Hi @Bruce,
you can use dr.PredictionExplanations.list(project_id) to get all available prediction explanations. You can then filter this list with something like this:
p = dr.Project.get(...)
m = dr.Model.get(...)
ds = dr.Dataset.get(...)
pe = [pe for pe in dr.PredictionExplanations.list(p.id) if pe.project_id == p.id and pe.model_id == m.id and pe.dataset_id == ds.id].pop()
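The same filter can be turned into a single status check that a script runs each time it wakes up, with no job ID stored locally. This is only a sketch: the function names are hypothetical, and it assumes that a prediction explanations record shows up in the list only once its computation has finished. Unlike the .pop() version above, it returns None instead of raising when nothing matches yet.

```python
def find_explanations(explanations, model_id, dataset_id):
    """Return the first record matching the model and dataset, or None."""
    for pe in explanations:
        if pe.model_id == model_id and pe.dataset_id == dataset_id:
            return pe
    return None


def check_once(project_id, model_id, dataset_id):
    """One poll: list the project's explanation records and pick the match."""
    # Imported lazily so find_explanations can be used without the client.
    import datarobot as dr

    records = dr.PredictionExplanations.list(project_id)
    return find_explanations(records, model_id, dataset_id)
```

If check_once(...) returns None, the script can simply exit and try again on its next run; otherwise the returned record can be used to fetch the explanations, for example with its get_all_as_dataframe() method.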
And to preempt a possible question: datasets also have a '.list' method.
Hope that helps.