important variables?

Our dataset has a lot of variables and we want to remove some, but we aren't sure which ones are important. Can you help?

Appreciate your help

Accepted Solutions

Hey @rick-wheller ,

DataRobot will automatically create a reduced feature list "based on the Feature Impact calculation of the best non-blender model in the Leaderboard". You can use this generated feature list to decide which features to keep and which to remove from your training set.

See the "DR Reduced Features" section in https://app.datarobot.com/docs/modeling/curate/feature-lists.html
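To make the idea concrete, here is a rough sketch of ranking features by a permutation-style impact score and then keeping only the top ones. This is not DataRobot's implementation: the function names, the toy model, and the 95% cumulative-impact cutoff are all assumptions chosen for illustration.

```python
import random

def permutation_impact(predict, rows, targets, n_features, seed=0):
    """Increase in mean squared error when each feature column is shuffled."""
    rng = random.Random(seed)

    def mse(data):
        return sum((predict(r) - t) ** 2 for r, t in zip(data, targets)) / len(targets)

    baseline = mse(rows)
    impacts = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        impacts.append(mse(shuffled) - baseline)
    return impacts

def reduced_feature_list(names, impacts, keep_fraction=0.95):
    """Keep the highest-impact features until they cover keep_fraction of total impact.

    The 0.95 default is an illustrative assumption, not a documented setting.
    """
    ranked = sorted(zip(names, impacts), key=lambda p: p[1], reverse=True)
    total = sum(max(i, 0.0) for _, i in ranked)
    kept, running = [], 0.0
    for name, impact in ranked:
        if impact <= 0 or running / total >= keep_fraction:
            break
        kept.append(name)
        running += impact
    return kept

# Toy data: the target depends strongly on x0, weakly on x1, and not at all on x2.
rng = random.Random(42)
rows = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]

def predict(r):
    # Hypothetical stand-in for a fitted model.
    return 3.0 * r[0] + 0.5 * r[1]

targets = [predict(r) for r in rows]
names = ["x0", "x1", "x2"]
impacts = permutation_impact(predict, rows, targets, n_features=3)
kept = reduced_feature_list(names, impacts)
print(dict(zip(names, impacts)), kept)
```

Shuffling a feature the model ignores leaves the error unchanged, so that feature scores zero and never makes it into the reduced list.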

If you have some more time, you can read this quick article on PCA for feature reduction: https://towardsdatascience.com/principal-component-analysis-intro-61f236064b38

Great answer from @stephen_p!

The only thing I would add is that certain blueprints will automatically run PCA & clustering techniques to reduce the dimensionality of your dataset. 
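For anyone curious what PCA does under the hood, here is a minimal standard-library sketch that finds the first principal component by power iteration and projects the data onto it. It is only meant to illustrate the technique, not what any particular blueprint runs; the function name and toy data are made up, and in practice you would more likely reach for a library such as scikit-learn's `PCA`.

```python
import math
import random

def first_principal_component(rows, iterations=200):
    """Estimate the top principal component of `rows` by power iteration.

    Returns (unit direction, share of total variance explained, projected scores).
    Assumes the data has non-zero variance and that the starting vector is not
    orthogonal to the top component.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    variance = sum(vec[a] * cov[a][b] * vec[b] for a in range(d) for b in range(d))
    total = sum(cov[a][a] for a in range(d))
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, variance / total, scores

# Toy data: the first two columns are driven by one hidden factor, the third is small noise.
rng = random.Random(0)
rows = []
for _ in range(300):
    t = rng.gauss(0, 1)
    rows.append([t + 0.1 * rng.gauss(0, 1),
                 2 * t + 0.1 * rng.gauss(0, 1),
                 0.1 * rng.gauss(0, 1)])

component, explained, scores = first_principal_component(rows)
print(component, explained)
```

Here three correlated columns collapse into a single score per row that carries most of the variance, which is the sense in which PCA reduces dimensionality.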

This is a great question to ask your assigned account team, particularly your CFDS. There are a number of deeper tips and tricks we can provide given more context.

Cheers,

Duncan
