XEMP and the top 50 features for large feature sets
Apparently, to calculate prediction explanations (PE), XEMP selects only the top 50 features for its synthetic data. Consider a use case where the model has more than 100 features that clearly affect the score. Does this mean that the XEMP results are not reliable?
What does the Feature Impact look like with 100 features? Do they all contribute evenly to the model, or do the top 50 explain the majority of the impact? In most cases, the prediction explanations returned by DataRobot (XEMP, up to 10) should be reliable. If you do want to explore all 100 features, you can try SHAP prediction explanations by enabling SHAP in Advanced Options before you start model training.
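One rough way to answer the "evenly or top-heavy" question is to compute what share of the total impact the 50 largest features carry. The sketch below is plain Python and does not call DataRobot; the function name `top_k_impact_share` and the two sample dictionaries are made up for illustration, and it assumes you have already exported Feature Impact scores as a feature-name-to-score mapping.

```python
def top_k_impact_share(impacts, k=50):
    """Return the fraction of total absolute impact carried by the k largest features.

    `impacts` is assumed to map feature name -> impact score (hypothetical input,
    e.g. values copied from a Feature Impact export).
    """
    magnitudes = sorted((abs(v) for v in impacts.values()), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(magnitudes[:k]) / total


# Made-up data: 100 features that all contribute equally.
even = {f"feature_{i}": 1.0 for i in range(100)}

# Made-up data: 100 features whose impact decays geometrically.
skewed = {f"feature_{i}": 0.9 ** i for i in range(100)}

print(f"even:   top 50 share = {top_k_impact_share(even):.3f}")
print(f"skewed: top 50 share = {top_k_impact_share(skewed):.3f}")
```

If the share is close to 1, the top 50 features account for nearly all of the impact and the 50-feature limit is unlikely to matter much. If it is closer to 0.5, as in the evenly distributed case, SHAP explanations may be the better fit.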