what is Tree-based Variable Importance?

Hello

 

I have a question: what is the difference between Tree-based Variable Importance (Models > Insights > Tree-based Variable Importance) and Feature Impact (Models > Understand > Feature Impact)?

 

I chose the same model (Light Gradient Boosted Trees Classifier with Early Stopping, 80.01% sample, 'new_list' feature list) in the Tree-based Variable Importance menu under "Insights".

However, the result was totally different from Feature Impact.

 

[Images: Insights, Understand]

 

Are these importances calculated with different methods?

Feature Impact -> permutation-based

Tree-based variable importance -> node impurity measures (gini, entropy)

Am I right?

 

Not in this dataset, but sometimes the Feature Impact ordering shown in the "Feature Effects" menu is different from the one in the "Feature Impact" menu.

I cannot understand this.

 

[Image: Feature Effect.PNG]

 


Accepted Solutions
Vinay
DataRobot Alumni

Hi @cookie_yamyam ,

 

Thank you for the question.

 

You were right to distinguish between Feature Impact and Tree-based Variable Importance: Feature Impact is permutation-based, and Tree-based Variable Importance is based on node impurity. Since they are calculated in different ways, the two methods are not expected to always produce the same result. Feature Impact (or permutation importance) is model-agnostic and is available for all the models on the Leaderboard, whereas Tree-based Variable Importance is only available for tree-based models.
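To illustrate why the two rankings can differ, here is a minimal sketch using scikit-learn rather than DataRobot itself. It is not how DataRobot computes either insight; the synthetic dataset, the `GradientBoostingClassifier` model, and all variable names are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): 6 features, 3 informative.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Impurity-based importance: derived from the splits made while training,
# normalized so the values sum to 1. Only tree-based models expose it.
impurity_importance = model.feature_importances_

# Permutation importance: shuffle one column at a time on held-out data and
# measure how much the score drops. Works for any fitted model.
perm = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
permutation_scores = perm.importances_mean

# Rank features from most to least important under each method.
impurity_rank = np.argsort(-impurity_importance)
permutation_rank = np.argsort(-permutation_scores)

print("impurity-based ranking:   ", impurity_rank)
print("permutation-based ranking:", permutation_rank)
```

The two rankings may or may not agree on a given dataset. Impurity-based values reflect how the trees used each feature during training, while permutation values reflect how much held-out performance depends on it, so correlated or redundant features are a common source of disagreement.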

 

Regarding your second question on Feature Effects sorted by impact: the order of features on the left side in Feature Effects is based on Feature Impact. However, some feature types, such as Text, are not displayed within Feature Effects because they have high cardinality. Hence, in some projects you may find fewer features in Feature Effects than in Feature Impact.

 

Let me know if these explanations answer your questions.

 

Vinay
