When analyzing the prediction explanations generated for one of the classification models, I noticed that the same feature with exactly the same value can have either a positive or a negative impact on the overall prediction score for different observations. I am sure there is a valid reason for this behavior, but I am wondering if someone can explain it.
For example, the model's feature list contains 43 features (categorical and numerical). A numerical feature named Contract_Score has the following possible values: 0, 3, 5, 7, 10, and missing. Contract_Score with a value of 10 has “++” impact strength for some rows, but “--” for other rows.
Thank you in advance for your help.
It's hard to generalize. A very low Feature Impact score only tells you about the feature's impact across the entire dataset; the feature could still be affecting segments of the dataset differently.
A couple of things you can do to dig deeper are to look at the Feature Effects/Partial Dependence for the feature as well as the Feature Fit. I also like to look at the rules in RuleFit that involve that feature to see which other features it tends to interact with and whether there are any patterns (similarly, you can check for interaction effects using the GA2M models).
I typically focus on the top 3 or so explanations in Prediction Explanations (which is the case here). Farther down the list, they can get really noisy.
Another option is to use the Prediction App, or just run some simple simulations yourself in which you slightly change the values and see what happens to the predictions. These nuanced effects are not always easy to highlight and identify with confidence.
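A what-if simulation of this kind can be sketched in a few lines. The sketch below is a minimal illustration, not DataRobot code: `predict` is a hypothetical stand-in for whatever scoring call you actually use, and `Tenure_Months` is an invented second feature, included only so that the toy model has something for Contract_Score to interact with.

```python
import math

# Possible non-missing values of Contract_Score, as listed in the question.
CONTRACT_SCORES = [0, 3, 5, 7, 10]


def predict(row):
    """Hypothetical stand-in for a real scoring call.

    Toy logistic model in which the effect of Contract_Score depends on
    Tenure_Months (an invented feature), i.e. the two features interact.
    """
    direction = 1.0 if row["Tenure_Months"] >= 12 else -1.0
    logit = -0.5 + 0.2 * direction * row["Contract_Score"]
    return 1.0 / (1.0 + math.exp(-logit))


def sweep_feature(row, feature, values, predict_fn):
    """Re-score one row with `feature` set to each value in turn."""
    results = {}
    for value in values:
        modified = dict(row)  # copy so the original row is left untouched
        modified[feature] = value
        results[value] = predict_fn(modified)
    return results


# Two rows with the same Contract_Score but different context.
rows = [
    {"Contract_Score": 10, "Tenure_Months": 36},
    {"Contract_Score": 10, "Tenure_Months": 3},
]

for row in rows:
    scores = sweep_feature(row, "Contract_Score", CONTRACT_SCORES, predict)
    # Change in prediction when moving from the lowest to the highest score.
    delta = scores[10] - scores[0]
    print(row, {k: round(v, 3) for k, v in scores.items()}, round(delta, 3))
```

With a real model, you would replace `predict` with your own scoring call and compare the sweeps across rows. If the direction of the change differs between rows, that points to an interaction with some other feature.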
My initial guess is that there is some interaction effect going on within the model. (A low correlation with the target doesn't rule that out.)
Let me know how it goes.
The example you provided with the size of outdoor space makes perfect sense, but I am still struggling to apply the same logic to my use case.
Let me ask you a more general question. The feature in my model (the example I provided) with both positive and negative impacts has a relatively low correlation with the target outcome. In fact, its relative importance for the model, calculated on the entire training dataset, is less than 5%. Is it safe to assume that less important features are more likely to show larger fluctuations in impact strength for the same value between different rows?
The explanations from DataRobot each explain an individual prediction. In some cases, the same variable can affect different rows of data in different ways, for example, because of interactions.
Consider a variable for the square footage of outdoor space. For houses in the suburbs or the country, its effect might vary from slightly positive to slightly negative. In contrast, in a dense city, it might be highly positive. (Probably not the greatest example, but does this help?)
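The house example can be made concrete with a toy model. The sketch below is an illustration only: the coefficients are invented, and the contribution calculation (re-scoring the row with the feature replaced by a baseline value) is a simplified stand-in, not the algorithm DataRobot uses to produce Prediction Explanations.

```python
def predict_price(outdoor_sqft, dense_city):
    """Toy house-price model (invented coefficients, price in thousands).

    The slope for outdoor space depends on location: strongly positive in a
    dense city, slightly negative elsewhere (for example, upkeep costs).
    """
    slope = 0.08 if dense_city else -0.01
    base = 400.0 if dense_city else 250.0
    return base + slope * outdoor_sqft


def contribution(outdoor_sqft, dense_city, baseline_sqft=500.0):
    """Prediction change relative to a baseline amount of outdoor space."""
    return predict_price(outdoor_sqft, dense_city) - predict_price(
        baseline_sqft, dense_city
    )


# Same feature value (2,000 sq ft) in two different contexts.
city = contribution(2000.0, dense_city=True)
suburb = contribution(2000.0, dense_city=False)
print(f"dense city: {city:+.1f}")
print(f"suburb:     {suburb:+.1f}")
```

The feature value is identical in both calls, yet the sign of its contribution differs because the slope depends on the second feature. That is the pattern that would show up as “++” on some rows and “--” on others.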