Hyper-parameter optimization of Deep Convolutional Networks for object
recognition
Recently, sequential model-based optimization (SMBO) has emerged as a promising hyper-parameter optimization strategy in machine learning. In this work, we investigate SMBO for identifying architecture hyper-parameters of deep convolutional networks (DCNs) for object recognition. We propose a simple SMBO strategy that starts from a set of random initial DCN architectures and generates new architectures which, once trained, perform well on a given dataset. Using the proposed SMBO strategy, we identify a number of DCN architectures that produce results comparable to the state of the art on object recognition benchmarks. Specifically, we report three DCN networks generated by our proposed algorithm that achieve test error rates below 9%, with the best network exhibiting a test error rate of 7.81% on the CIFAR-10 benchmark. Our results compare favorably to the current state-of-the-art test error rate of 7.97% on CIFAR-10, obtained by hand tuning.
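To make the setting concrete, the SMBO loop described above can be sketched as follows. This is a hedged toy illustration, not the paper's algorithm: the architecture space, the `evaluate` objective (standing in for training a DCN and measuring test error), and the nearest-neighbor surrogate are all hypothetical simplifications of a real SMBO surrogate model.

```python
import random

def evaluate(arch):
    # Hypothetical stand-in for "train a DCN with this architecture and
    # return its test error"; arch = (num_layers, filters_per_layer).
    return abs(arch[0] - 4) * 0.01 + abs(arch[1] - 96) * 0.0005 + random.uniform(0, 0.005)

def surrogate(history, cand):
    # Toy surrogate: predict a candidate's error as that of the closest
    # already-evaluated architecture (real SMBO fits a proper model, e.g. a GP).
    _, err = min(history,
                 key=lambda h: (h[0][0] - cand[0]) ** 2 + ((h[0][1] - cand[1]) / 32) ** 2)
    return err

random.seed(0)
space = [(layers, filters) for layers in range(2, 9) for filters in (32, 64, 96, 128)]

# Start from a set of random initial architectures, as in the abstract.
history = [(a, evaluate(a)) for a in random.sample(space, 5)]

for _ in range(20):
    # Propose candidates, rank them with the surrogate, evaluate the best proposal.
    candidates = random.sample(space, 8)
    pick = min(candidates, key=lambda c: surrogate(history, c))
    history.append((pick, evaluate(pick)))

best_arch, best_err = min(history, key=lambda h: h[1])
print("best architecture:", best_arch, "error:", round(best_err, 4))
```

The loop alternates between fitting a cheap surrogate to all (architecture, error) pairs seen so far and spending the expensive evaluation budget only on the candidate the surrogate ranks best, which is the core idea behind SMBO.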