Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm

Gu, Bin; Liu, Guodong; Zhang, Yanfu; Geng, Xiang; Huang, Heng

Abstract:Modern machine learning algorithms usually involve tuning multiple (from one to thousands) hyperparameters which play a pivotal role in terms of model generalizability. Black-box optimization and gradient-based algorithms are two dominant approaches to hyperparameter optimization while they have totally distinct advantages. How to design a new hyperparameter optimization technique inheriting all benefits from both approaches is still an open problem. To address this challenging problem, in this paper, we propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG). Specifically, we first exactly formulate hyperparameter optimization as an A-based constrained optimization problem, where A is a black-box optimization algorithm (such as deep neural network). Then, we use the average zeroth-order hyper-gradients to update hyperparameters. We provide the feasibility analysis of using HOZOG to achieve hyperparameter optimization. Finally, the experimental results on three representative hyperparameter (the size is from 1 to 1250) optimization tasks demonstrate the benefits of HOZOG in terms of simplicity, scalability, flexibility, effectiveness and efficiency compared with the state-of-the-art hyperparameter optimization methods.

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2102.09026 [cs.LG]
	(or arXiv:2102.09026v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2102.09026

Computer Science > Machine Learning

Title:Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators