
FIRE for advanced feature selection

I read this article about the FIRE (Feature Importance Rank Ensembling) method for advanced feature selection: https://www.datarobot.com/blog/using-feature-importance-rank-ensembling-fire-for-advanced-feature-se.... Is this method suitable to use, or are there other suggestions for further shortlisting or reducing the number of features for modeling?

2 Replies

Yes, you can use that method to aggregate feature importance, and you can also try the different feature impact methods available in DataRobot. The methods are:

  • Permutation-based: shows how much a model's error would increase, measured on a sample of the training data, if the values in a column were shuffled.
  • SHAP-based: shows how much, on average, each feature affects the prediction values on the training data. For supervised projects, SHAP is available for AutoML projects only. See also the SHAP reference.
  • Tree-based variable importance: uses node impurity measures (Gini, entropy) to show how much gain each feature adds to the model.

You can read more about these methods here.
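
To make the permutation-based idea above concrete, here is a minimal sketch in plain Python. It is not DataRobot's implementation; `permutation_importance`, `toy_model`, and the toy data are all hypothetical names and values made up for illustration. It shuffles one column at a time and measures how much the mean squared error increases.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_features, seed=0):
    """Return the error increase observed when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(targets, [predict(r) for r in rows])
    increases = []
    for j in range(n_features):
        # Shuffle only column j, leaving every other column untouched.
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled_rows = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        shuffled_error = mse(targets, [predict(r) for r in shuffled_rows])
        increases.append(shuffled_error - baseline)
    return increases


# Toy data (assumption): the target depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(500)]
targets = [3.0 * r[0] + 0.5 * r[1] for r in rows]


def toy_model(row):
    # Hypothetical stand-in for a trained model; it matches the toy data exactly.
    return 3.0 * row[0] + 0.5 * row[1]


importances = permutation_importance(toy_model, rows, targets, n_features=3)
```

In this toy setup, feature 0 should show the largest error increase and feature 2 none at all, because the stand-in model never reads it. On real data you would typically repeat the shuffle several times and average the result.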

Yes, you can use classic approaches after running FIRE.

 

For example, within your selected model you can open Feature Impact and select the top N features, optionally excluding deprecated features. Once this is done, you can rerun the model, or the entire Autopilot, to see how much the model performance degrades. You would expect some, but not much, performance reduction, usually (well) under 1% or so. In return, you gain a simpler and potentially more stable model that can generate predictions much faster.
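
The top-N workflow above can be sketched roughly as follows. This is plain Python rather than the DataRobot API: `top_n_features`, `relative_degradation`, the feature names, the impact scores, and the two AUC values are all hypothetical, chosen only to show the bookkeeping. In practice the scores would come from retraining on the reduced feature list.

```python
def top_n_features(impact, n):
    """Return the n feature names with the highest impact scores."""
    ranked = sorted(impact.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def relative_degradation(full_score, reduced_score, higher_is_better=True):
    """Fractional performance loss of the reduced model relative to the full model."""
    if higher_is_better:
        return (full_score - reduced_score) / abs(full_score)
    return (reduced_score - full_score) / abs(full_score)


# Illustrative numbers only (assumptions, not real results).
impact = {"age": 0.42, "income": 0.31, "tenure": 0.15, "zip": 0.02, "id_hash": 0.001}
shortlist = top_n_features(impact, 3)

full_auc = 0.842      # hypothetical score of the model trained on all features
reduced_auc = 0.838   # hypothetical score after retraining on the shortlist

drop = relative_degradation(full_auc, reduced_auc)
accept_shortlist = drop < 0.01  # the "under 1% or so" rule of thumb from above
```

For error metrics such as RMSE or LogLoss, where lower is better, pass `higher_is_better=False` so that a positive result still means the reduced model got worse.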
