Grid and random search
Grid search and random search are two common methods for hyperparameter tuning. We covered random search in the previous section. In this section, we implement grid search and a more advanced version of random search.
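As a rough illustration of the difference between the two methods: grid search enumerates every combination in a search space, while random search draws a fixed number of configurations from it. The sketch below assumes a small made-up search space and a hypothetical helper, `sample_configs`; neither comes from the implementation in this section.

```python
import itertools
import random

# Hypothetical search space, for illustration only.
space = {
    "num_layers": [6, 12, 24],
    "hidden_size": [512, 768, 1024],
}

# Grid search: every combination of values (3 * 3 = 9 configurations).
grid_configs = [
    dict(zip(space, values))
    for values in itertools.product(*space.values())
]

def sample_configs(space, n, seed=0):
    """Random search: draw n configurations, choosing each value independently."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(choices) for name, choices in space.items()}
        for _ in range(n)
    ]

random_configs = sample_configs(space, n=4)
print(len(grid_configs), len(random_configs))  # prints "9 4"
```

The grid grows multiplicatively with each added hyperparameter, whereas the cost of random search is fixed by the number of samples requested.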
- Add the import and set up the grid search parameters:
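import itertools

def grid_search():
    # Candidate values for each hyperparameter (3 * 3 * 3 * 3 = 81 combinations).
    hp_grid = {
        "num_layers": [6, 12, 24],
        "hidden_size": [512, 768, 1024],
        "num_heads": [8, 12, 16],
        "ff_dim": [2048, 3072, 4096]
    }
    # Track the best configuration seen so far.
    best_eval_loss = float('inf')
    best_hp = None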
- Iterate over every combination of values in the grid and train the model with each one:
    # Each hp is a tuple of values, ordered like the keys of hp_grid.
    for hp in itertools.product(*hp_grid.values()):
        ...
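The loop body is elided above. As a minimal sketch of how such a loop could be completed, the example below evaluates every combination and keeps the one with the lowest loss. The names `grid_search_sketch` and `train_and_evaluate` are hypothetical, and `train_and_evaluate` is a toy stand-in that returns a made-up loss instead of training a real model, so this is not the original implementation.

```python
import itertools

def train_and_evaluate(hp):
    """Hypothetical stand-in for real training; returns a made-up loss."""
    # Toy objective so the sketch runs instantly. It is lowest (0.0) at
    # num_layers=12, hidden_size=768, num_heads=12, ff_dim=3072.
    target = {"num_layers": 12, "hidden_size": 768, "num_heads": 12, "ff_dim": 3072}
    return sum(abs(hp[k] - target[k]) / target[k] for k in target)

def grid_search_sketch(hp_grid, evaluate):
    """Evaluate every combination in hp_grid and return the best one."""
    best_eval_loss = float("inf")
    best_hp = None
    for values in itertools.product(*hp_grid.values()):
        # Pair each value with its hyperparameter name.
        hp = dict(zip(hp_grid.keys(), values))
        eval_loss = evaluate(hp)
        if eval_loss < best_eval_loss:
            best_eval_loss, best_hp = eval_loss, hp
    return best_hp, best_eval_loss

hp_grid = {
    "num_layers": [6, 12, 24],
    "hidden_size": [512, 768, 1024],
    "num_heads": [8, 12, 16],
    "ff_dim": [2048, 3072, 4096],
}

best_hp, best_eval_loss = grid_search_sketch(hp_grid, train_and_evaluate)
print(best_hp, best_eval_loss)
```

Passing the evaluation function as an argument keeps the search loop independent of how a model is trained, so the toy objective could be swapped for a real training run without touching the loop.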