We can significantly reduce the number of features in our dataset by leveraging DataRobot's ability to train hundreds of high-quality ML models in a matter of minutes.
Feature Importance Rank Ensembling (FIRE) aggregates the rankings of individual features using Feature Impact from several blueprints on the Leaderboard. This approach can provide greater accuracy and robustness than other feature reduction methods.
About this Accelerator
This accelerator shows how to apply FIRE to your dataset and substantially reduce the number of features while preserving the performance of the final model.
What you will learn
Calculate the permutation feature importance for the top five performing models in the Leaderboard against the selected metric.
For each model with computed feature importance, get the ranking of the features.
Compute the median rank of each feature by aggregating the ranks of the features across all models.
Sort the aggregated list by the computed median rank.
Define the threshold number of features to select. In this case, use the number of features that account for 95% of the cumulative feature impact.
Create a feature list based on the newly selected features.
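The ranking, aggregation, and thresholding steps above can be sketched in plain pandas. This is a minimal illustration under stated assumptions, not the accelerator's own code: the function name fire_select, the input shape (a dict mapping each model to its per-feature impact scores), the example values, and the choice to measure cumulative impact on the median impact across models are all hypothetical. Fetching Feature Impact from the Leaderboard models and saving the resulting feature list are left to the DataRobot client and are not shown.

```python
import pandas as pd


def fire_select(impacts_by_model, threshold=0.95):
    """Median-rank aggregation sketch (hypothetical helper).

    impacts_by_model: dict of model id -> {feature name: impact score}.
    Returns (summary DataFrame, list of selected feature names).
    """
    # Rows are features, columns are models; a feature a model did not
    # report is treated as having zero impact (an assumption).
    impact = pd.DataFrame(impacts_by_model).fillna(0.0)

    # Rank features within each model: rank 1 is the most impactful.
    ranks = impact.rank(axis=0, ascending=False, method="average")

    # Aggregate across models with the median, then sort by median rank
    # (ties broken by higher median impact).
    summary = pd.DataFrame({
        "median_rank": ranks.median(axis=1),
        "median_impact": impact.median(axis=1),
    }).sort_values(["median_rank", "median_impact"], ascending=[True, False])

    # Cumulative share of the aggregated impact, in rank order.
    share = summary["median_impact"].cumsum() / summary["median_impact"].sum()
    summary["cumulative_share"] = share

    # Keep the shortest prefix whose cumulative share reaches the threshold.
    n_keep = min(int((share < threshold).sum()) + 1, len(summary))
    return summary, list(summary.index[:n_keep])


# Made-up impact scores for three models, for illustration only.
example_impacts = {
    "model_a": {"age": 0.50, "income": 0.30, "tenure": 0.15, "zip": 0.04, "noise": 0.01},
    "model_b": {"age": 0.45, "income": 0.35, "tenure": 0.12, "zip": 0.05, "noise": 0.03},
    "model_c": {"income": 0.48, "age": 0.40, "tenure": 0.08, "zip": 0.03, "noise": 0.01},
}

summary, selected = fire_select(example_impacts)
print(summary)
print(selected)
```

The median is used rather than the mean so that one model with an unusual ranking does not dominate the aggregate. The selected names would then be passed to whatever call creates the new feature list in your project.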