
Why does DR not auto grid search on all hyperparameters?

jialinli
Data Scientist


Hi there,

 

Why does DR not auto grid search on all hyperparameters?

 

Thanks

4 Replies
taylor_larkin
Data Scientist

Hi @jialinli,

 

Great question! We've spent the last decade developing heuristics for which hyperparameters to tune in a model given a particular dataset. Leveraging these heuristics allows models to finish training much faster than brute-force trying everything. As a user, though, you're welcome to run a brute-force search yourself using "advanced tuning" in the platform (which is also accessible via the modeling APIs).

 

Cheers,

 

Taylor

felix_r
Data Scientist

Following on from Taylor's comment, to access advanced tuning in order to run your own grid search:

felix_r_2-1659038578793.png

 

Once in there, you can input parameters either as a single value or as a list (a Python list, for example; the square brackets are optional):

felix_r_3-1659038798556.png

 

In the example above I gave multiple sets of values to search through for min_sample_leaf and min_sample_split. It will search through all combinations of those (and, as expected, it might take a while if there are many sets of values to search through, since it builds a model each time). Some parameters, such as the number of estimators in this Random Forest example, only allow a single value. (The parameters available depend on the model itself.)
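To make "all combinations" concrete, here is a minimal sketch in plain Python of how a grid over two parameters expands. It is only an illustration, not what the platform runs internally: the candidate values are hypothetical, and `expand_grid` is a helper name made up for this example.

```python
from itertools import product

# Hypothetical candidate values; the parameter names follow the post above.
grid = {
    "min_sample_leaf": [1, 5, 10],
    "min_sample_split": [2, 10],
}


def expand_grid(grid):
    """Return one dict per combination of candidate values."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


combinations = expand_grid(grid)
for combo in combinations:
    print(combo)  # each combination corresponds to one model being built
```

With three values for one parameter and two for the other, six models are built. Adding a third parameter with four values would raise that to 24.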

All the sub-models included in the blueprint, for example the Auto-Tuned Word N-Gram Text Modeler using token occurrences, will have their own adjustable sets of parameters within the same advanced tuning screen.

Once you have set up all the choices, add a description of the tuning and click 'begin tuning'. The best resulting models will be added to the leaderboard, where they are easily identifiable by the green text describing the tuning.

 

 

felix_r_4-1659039338370.png

 

In this case, the tuned model beat the existing model slightly, but rest assured that was pure luck. To see which of the selected parameters 'won', pop back into advanced tuning for the resulting model and look through the values originally proposed to see which gave the best performance.

felix_r_5-1659039529665.png

 


 

 

shaz13
Data Scientist

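Training an individual model with a full grid search over every hyperparameter would be computationally intractable in practice. The number of models to fit is the product of the number of candidate values for each parameter, so the run time is roughly O(N*P1*P2*P3*...), which grows exponentially with the number of parameters tuned.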
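A quick back-of-the-envelope calculation shows how fast the P1*P2*P3*... term grows. This is a sketch under stated assumptions: `grid_fits` is a hypothetical helper, and five candidate values per parameter is an arbitrary choice for illustration.

```python
from math import prod


def grid_fits(values_per_param, cv_folds=1):
    """Number of model fits needed for an exhaustive grid search.

    values_per_param: number of candidate values for each hyperparameter.
    cv_folds: fits per combination (1 means a single train/validation split).
    """
    return cv_folds * prod(values_per_param)


# Assume 5 candidate values per hyperparameter and vary how many are tuned.
for n_params in (2, 4, 6, 8):
    print(n_params, "parameters ->", grid_fits([5] * n_params), "fits")
```

Each additional tuned parameter multiplies the total number of fits, which is why searching only the parameters that heuristics flag as important saves so much time.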

For the majority of algorithms, XGBoost for example, we can approximate from the algorithm's behaviour where the valley of good parameters will be. Usually, we get this from experience and research. DataRobot uses an optimized, heuristic-based approach.

You can also read about pattern search to understand more about how k parameters can be tuned through optimization rather than exhaustive search.
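For intuition, here is a minimal sketch of a generic compass-style pattern search on a toy two-parameter loss. It is a textbook-style illustration and not DataRobot's implementation: `pattern_search` and `toy_loss` are hypothetical names, and the loss is a made-up stand-in for a validation-error surface.

```python
def pattern_search(objective, start, step=1.0, min_step=1e-3, max_iter=1000):
    """Minimise `objective` with a basic compass (pattern) search.

    Tries a step of +/- `step` along each coordinate, moves whenever the
    objective improves, and halves the step when no move helps.
    """
    point = list(start)
    best = objective(point)
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                candidate = point.copy()
                candidate[i] += delta
                score = objective(candidate)
                if score < best:
                    point, best = candidate, score
                    improved = True
        if not improved:
            step /= 2  # no better neighbour: shrink the pattern
    return point, best


def toy_loss(params):
    """Toy stand-in for a validation-error surface, minimised at (3, -1)."""
    return (params[0] - 3) ** 2 + (params[1] + 1) ** 2


best_point, best_loss = pattern_search(toy_loss, start=[0.0, 0.0])
print(best_point, best_loss)
```

Instead of evaluating every point on a grid, the search only evaluates neighbours of the current best point, so the number of fits depends on the path taken rather than on the full product of candidate values.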

 

dalilaB
Data Scientist

Here is a link from our team that should help answer your question.
