Learning to Transfer Initializations for Bayesian Hyperparameter Optimization

Abstract

Hyperparameter optimization requires extensive evaluations of validation error in order to find the best configuration of hyperparameters. Bayesian optimization is now popular for hyperparameter optimization, since it reduces the number of validation-error evaluations required. Suppose that we are given a collection of datasets on which hyperparameters have already been tuned, either by humans with domain expertise or by extensive cross-validation trials. When a model is applied to a new dataset, it is desirable to start Bayesian hyperparameter optimization from configurations that were successful on similar datasets. To this end, we construct a Siamese network with convolutional layers followed by bi-directional LSTM layers to learn {\em meta-features} over datasets. The learned meta-features are used to select a few datasets that are similar to the new dataset, and the successful configurations on those datasets are adopted as initializations for Bayesian hyperparameter optimization. Experiments on image datasets demonstrate that our learned meta-features are useful for optimizing several hyperparameters of deep residual networks for image classification.
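The selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes meta-features have already been learned as fixed vectors, uses Euclidean distance as the similarity measure (the paper's learned Siamese similarity may differ), and the function name `select_initializations` and the toy data are hypothetical.

```python
import numpy as np

def select_initializations(new_meta, dataset_metas, best_configs, k=2):
    """Pick the best-known hyperparameter configurations of the k datasets
    whose meta-feature vectors are closest to the new dataset's vector."""
    # Euclidean distance in the learned meta-feature space (an assumption;
    # a learned similarity could be substituted here)
    dists = np.linalg.norm(dataset_metas - new_meta, axis=1)
    nearest = np.argsort(dists)[:k]
    # Adopt these configurations as initial points for Bayesian optimization
    return [best_configs[i] for i in nearest]

# Toy example: 4 previously tuned datasets with 3-dim meta-features
metas = np.array([[0.10, 0.90, 0.20],
                  [0.80, 0.10, 0.70],
                  [0.15, 0.85, 0.25],
                  [0.50, 0.50, 0.50]])
configs = [{"lr": 0.10}, {"lr": 0.01}, {"lr": 0.08}, {"lr": 0.05}]
inits = select_initializations(np.array([0.12, 0.88, 0.22]), metas, configs, k=2)
```

In this toy example, datasets 0 and 2 are nearest to the new dataset, so their best configurations would seed the Bayesian optimizer.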
