In machine learning, GridSearch is a tool used to find a well-performing model by searching for a good set of hyperparameter values. But what does this mean in an intuitive, easy-to-understand way?
Let’s start with an analogy: Let’s say that you’re baking cookies and you want them to taste as good as they possibly can. To keep it simple, let’s say you have exactly two ingredients: flour and sugar. (Realistically, you need more ingredients but just go with me for now.)
How much flour do you add? How much sugar do you add? Maybe you look up recipes online, but they’re all telling you different things. There’s not some magical, perfect amount of flour and sugar that you can just look up online.
So, what do you decide to do? Your strategy is to bake many batches of cookies where each batch has a different amount of flour and sugar in it. Then, you can “taste test” each batch to see what tastes best.
To make this explicit, let’s say that you try having:
- 1, 2, or 3 cups of sugar
- 3, 4, or 5 cups of flour
In order to see which recipe makes the best cookies, you have to test each possible combination of sugar and flour. You need to try making cookies with 1 cup of sugar and 3 cups of flour, 1 cup of sugar and 4 cups of flour, 1 cup of sugar and 5 cups of flour, 2 cups of sugar and 3 cups of flour, and so on.
A really helpful way to organize this is to draw a grid. Draw a 3x3 grid, kind of like you’re playing the game tic-tac-toe:
Above the first column, put 1 cup of sugar. Above the second and third columns, put 2 and 3 cups of sugar, respectively. (Depending on how you draw this, it might look like you added a fourth row here.)
To the left of the first row, put 3 cups of flour. Put 4 and 5 cups of flour to the left of the second and third rows, respectively.
|                 | 1 cup of sugar | 2 cups of sugar | 3 cups of sugar |
|-----------------|----------------|-----------------|-----------------|
| 3 cups of flour |                |                 |                 |
| 4 cups of flour |                |                 |                 |
| 5 cups of flour |                |                 |                 |
Then, fill in each of the squares of the grid with the amounts of sugar and flour corresponding to that row and column.
|                 | 1 cup of sugar | 2 cups of sugar | 3 cups of sugar |
|-----------------|----------------|-----------------|-----------------|
| 3 cups of flour | 1 cup of sugar & 3 cups of flour | 2 cups of sugar & 3 cups of flour | 3 cups of sugar & 3 cups of flour |
| 4 cups of flour | 1 cup of sugar & 4 cups of flour | 2 cups of sugar & 4 cups of flour | 3 cups of sugar & 4 cups of flour |
| 5 cups of flour | 1 cup of sugar & 5 cups of flour | 2 cups of sugar & 5 cups of flour | 3 cups of sugar & 5 cups of flour |
Notice how this looks like a grid. You are searching this grid for the best combination of sugar and flour. The only way to guarantee the best-tasting cookies is to bake cookies with all of these combinations, “taste test” each batch, and decide which batch is best! If you skipped some of the combinations, then it’s possible you’d miss the best-tasting cookies.
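This exhaustive taste test is easy to sketch in a few lines of Python. Here, `taste_score` is a made-up stand-in for actually baking and tasting a batch (in real model tuning, it would be fitting a model and scoring it):

```python
from itertools import product

sugar_levels = [1, 2, 3]   # cups of sugar
flour_levels = [3, 4, 5]   # cups of flour

def taste_score(sugar, flour):
    # Hypothetical stand-in for baking a batch and tasting it.
    # We pretend 2 cups of sugar and 4 cups of flour is the ideal recipe.
    return -((sugar - 2) ** 2 + (flour - 4) ** 2)

# Exhaustively "bake" every combination on the grid.
results = {(s, f): taste_score(s, f) for s, f in product(sugar_levels, flour_levels)}

best_recipe = max(results, key=results.get)
print(len(results))   # 9 batches
print(best_recipe)    # (2, 4)
```

`itertools.product` generates exactly the cells of the grid above, which is why GridSearch is named the way it is.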
As you can see, you have 2 ingredients (sugar and flour), each with 3 levels:
[3 levels of sugar] * [3 levels of flour] = 9 batches of cookies
Now, what happens when you’re in the real world and you have more than two ingredients? For example, you also have to decide how many eggs to include. Well, your “grid” now becomes a 3-dimensional grid. If you decide between 2 eggs and 3 eggs, then you need to try all nine combinations of sugar & flour for 2 eggs, and you need to try all nine combinations of sugar & flour for 3 eggs:
[3 levels of sugar] * [3 levels of flour] * [2 levels of eggs] = 18 batches of cookies
Obviously, the more types of ingredients you include (e.g. sugar, flour, eggs), the more combinations you have to test. Likewise, the more levels of each ingredient you include (e.g. 3 levels for sugar, 3 levels for flour, 2 levels for eggs), the more combinations you have to test.
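You can verify this multiplicative growth directly: each new ingredient multiplies the number of batches by its number of levels.

```python
from itertools import product

sugar = [1, 2, 3]  # 3 levels
flour = [3, 4, 5]  # 3 levels
eggs = [2, 3]      # 2 levels

two_ingredients = list(product(sugar, flour))
three_ingredients = list(product(sugar, flour, eggs))

print(len(two_ingredients))    # 3 * 3 = 9 batches
print(len(three_ingredients))  # 3 * 3 * 2 = 18 batches
```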
When you build models, you have lots of choices to make. Some of these choices are called hyperparameters. For example, if you build a random forest, you need to choose things like:
- how many trees to build,
- how deep each tree is allowed to grow, and
- how many features to consider at each split.
We call the process of finding the best values of hyperparameters “model tuning.”
The way you test this is just like how you taste-tested all of those different batches of cookies.
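Swapping ingredients for hyperparameters, the same loop appears. This is a minimal, framework-free sketch of what GridSearch does under the hood; the grid values and the `evaluate_model` function are made-up stand-ins for fitting a real model and scoring it on validation data:

```python
from itertools import product

# Hypothetical hyperparameter grid for a random forest.
param_grid = {
    "n_trees": [100, 200, 300],
    "max_depth": [5, 10, 15],
}

def evaluate_model(params):
    # Stand-in for training a model with these hyperparameters and
    # measuring its score on validation data. We pretend that
    # 200 trees with depth 10 is the best setting.
    return -abs(params["n_trees"] - 200) - abs(params["max_depth"] - 10)

names = list(param_grid)
best_params, best_score = None, float("-inf")
for values in product(*param_grid.values()):
    params = dict(zip(names, values))
    score = evaluate_model(params)
    if score > best_score:
        best_params, best_score = params, score

print(best_params)  # {'n_trees': 200, 'max_depth': 10}
```

Libraries like scikit-learn package this exact loop (plus cross-validation) behind a single class, but the idea is no more complicated than the cookie grid.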
Just like with ingredients, the number of hyperparameters and number of levels you search are important.
From earlier, you saw that we took the number of levels of each hyperparameter that we wanted to test and multiplied those numbers together. That’s the formula for finding how many models you’re fitting via GridSearch. For example, if you want to search 5 hyperparameters, each with 4 different levels, then you’re building:
4 * 4 * 4 * 4 * 4 = 4^5 = 1,024 models
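That multiplication is quick to check:

```python
from math import prod

levels_per_hyperparameter = [4, 4, 4, 4, 4]  # 5 hyperparameters, 4 levels each
total_models = prod(levels_per_hyperparameter)
print(total_models)  # 1024
```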
Building models can be time-consuming! If you try too many hyperparameters and too many levels of each hyperparameter, you might get a really high-performing model but it might take a really, really, really long time to get! (You can also run the risk of something called overfitting a model to the training data.)
GridSearch is a commonly used technique in machine learning for finding the best set of hyperparameters for a model.
If you were building these models on your own, you would likely have to do this process manually. However, DataRobot will do this for you!
For each of the models on the Leaderboard that contain tunable hyperparameters, DataRobot will automatically search over predetermined levels of hyperparameters for you. DataRobot is leveraging what we’ve learned by fitting more than 1 billion models — yes, that’s billion with a B! — to help you get the best-performing model as quickly as possible.
DataRobot does this via a smart pattern search. Rather than exhaustively searching every combination of hyperparameter values, DataRobot intelligently emphasizes areas where the model is likely to do well and skips hyperparameter values that are unlikely to perform well.
As an advanced option, DataRobot also permits you to experiment with your own hyperparameter settings with manual hyperparameter tuning. You can learn how to manually tune your hyperparameters in DataRobot here.
GridSearch isn’t the only option, though! RandomizedSearch is an alternative with the same goal: it searches multiple values of hyperparameters to identify a high-performing model.
The difference between GridSearch and RandomizedSearch is the grid. With GridSearch, you specify which levels of hyperparameters you want to check, then you check every possible combination. A grid is a helpful way of visualizing every combination of hyperparameter values that you are going to test.
RandomizedSearch is different. With RandomizedSearch, rather than setting up a grid to check, you might simply specify a range for each hyperparameter. For example, somewhere between 1 and 3 cups of sugar, somewhere between 3 and 5 cups of flour. Then, a computer will randomly generate, say, 5 combinations of sugar and flour to test.
RandomizedSearch would then “taste test” each batch and return the cookies that taste the best — or the model that performs the best! There are advantages and disadvantages to each approach. You can learn more about the differences between RandomizedSearch and GridSearch here.
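In code, the only change from the grid version is how the candidate recipes are produced: sampled from ranges instead of enumerated from a grid. As before, `taste_score` is a made-up stand-in for baking and tasting:

```python
import random

random.seed(0)  # seeded so the illustration is reproducible

def taste_score(sugar, flour):
    # Hypothetical stand-in for baking a batch and tasting it.
    return -((sugar - 2) ** 2 + (flour - 4) ** 2)

# Instead of a fixed grid, sample 5 random recipes from the ranges.
candidates = [
    (random.uniform(1, 3), random.uniform(3, 5))  # (cups of sugar, cups of flour)
    for _ in range(5)
]

best = max(candidates, key=lambda c: taste_score(*c))
print(best)
```

Note that the random samples can land anywhere in the ranges (e.g. 2.3 cups of sugar), not just on the grid points, which is one of the practical differences between the two approaches.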