
Learning Convolutional Neural Networks using Hybrid Orthogonal Projection and Estimation

Abstract

Convolutional neural networks (CNNs) have yielded excellent performance in a variety of computer vision tasks, where CNNs typically adopt a similar structure consisting of convolution layers, pooling layers and fully connected layers. In this paper, we propose to apply a novel method, namely Hybrid Orthogonal Projection and Estimation (HOPE), to CNNs in order to introduce orthogonality into the CNN structure. The HOPE model can be viewed as a hybrid model that combines feature extraction using an orthogonal linear projection with mixture models. It is an effective model for extracting useful information from the original high-dimensional feature vectors while filtering out irrelevant noise. In this work, we present two different ways to apply the HOPE models to CNNs, i.e., {\em HOPE-Input} and {\em HOPE-Pooling}. In {\em HOPE-Input}, a HOPE layer is placed directly after the input to de-correlate the high-dimensional input feature vectors. Alternatively, in {\em HOPE-Pooling}, a HOPE layer replaces the regular pooling layer in CNNs. Experimental results on both the CIFAR-10 and CIFAR-100 datasets show that the orthogonality constraints imposed by the HOPE layers can significantly improve the performance of CNNs in these image classification tasks (we achieve top-3 performance when image augmentation is not applied).
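To make the idea of an orthogonality-constrained projection layer concrete, the following is a minimal, illustrative PyTorch sketch, not the paper's implementation. It assumes the orthogonality constraint is enforced as a soft penalty on the row correlations of a 1x1-convolution projection matrix; the class name, penalty form, and hyperparameters (e.g., the weight beta) are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HOPELikeProjection(nn.Module):
    """Illustrative linear projection layer with a soft orthogonality penalty.

    Sketch only: projects D input channels to M output channels (M < D) with a
    1x1 convolution and exposes a penalty that pushes the projection rows
    toward mutual orthogonality, in the spirit of the HOPE constraint.
    """

    def __init__(self, in_channels, proj_channels):
        super().__init__()
        # A 1x1 convolution acts as a per-location linear projection W (M x D).
        self.proj = nn.Conv2d(in_channels, proj_channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.proj(x)

    def orthogonal_penalty(self):
        # Weight has shape (M, D, 1, 1); flatten it to a plain (M, D) matrix.
        W = self.proj.weight.flatten(1)
        # Normalize rows so the penalty measures correlation between projection
        # directions rather than their norms.
        W = F.normalize(W, dim=1)
        gram = W @ W.t()
        eye = torch.eye(gram.size(0), device=gram.device)
        # Sum of absolute off-diagonal correlations (one common soft variant).
        return (gram - eye).abs().sum()


# Hypothetical usage for a HOPE-Input-style layer on CIFAR-sized images:
layer = HOPELikeProjection(in_channels=3, proj_channels=32)
x = torch.randn(8, 3, 32, 32)          # a dummy CIFAR-10 mini-batch
features = layer(x)
beta = 0.01                            # illustrative penalty weight
loss = features.mean() + beta * layer.orthogonal_penalty()
loss.backward()
```

A HOPE-Pooling-style variant would, under the same assumptions, place such a projection where a regular pooling layer sits and add its penalty to the task loss in the same way.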
