Basic Level Categorization is Important: Modelling the Facilitation
in Visual Object Recognition
Recent advances in deep learning have led to significant progress in computer vision, especially in visual object recognition. Feed-forward deep convolutional neural networks (CNNs) automatically learn the features useful for object classification, and these features have been shown to be useful for probing representations in neural data from the ventral visual pathway. However, although numerous studies have addressed optimizing CNNs, few have focused on linking them to the guiding principles of the human visual cortex. In this work, we propose a network optimization strategy inspired by Bar (2003), based on the hypothesis that performing the basic-level object categorization task first can facilitate the subordinate-level categorization task. Under this account, basic-level information carried by the "fast" magnocellular pathway through prefrontal cortex (PFC) is projected back to inferior temporal cortex (IT), where subordinate-level categorization is achieved. By adopting this principle when training AlexNet (Krizhevsky et al., 2012) on the ILSVRC 2012 dataset, we show that top-5 accuracy increases from 80.13% to 81.48%, demonstrating the effectiveness of the method. Fine-tuning results show that the learned features have stronger generalization power.
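The abstract does not specify how the basic-level and subordinate-level tasks are combined during training, so the following is only a minimal sketch of one possible data-preparation step, assuming a two-stage setup in which a network first sees basic-level targets and then the original subordinate-level targets. The toy hierarchy `SUBORDINATE_TO_BASIC` and the helpers `build_index` and `make_stage_targets` are hypothetical names introduced for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical toy hierarchy (an assumption, not from the paper):
# subordinate-level label -> basic-level label.
SUBORDINATE_TO_BASIC: Dict[str, str] = {
    "beagle": "dog",
    "poodle": "dog",
    "tabby": "cat",
    "siamese": "cat",
    "sports_car": "car",
}


def build_index(names: List[str]) -> Dict[str, int]:
    """Assign a stable integer class id to each distinct name (sorted order)."""
    return {name: i for i, name in enumerate(sorted(set(names)))}


def make_stage_targets(
    samples: List[Tuple[str, str]],
) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int]]]:
    """Derive two target lists from (image_id, subordinate_label) pairs.

    Stage 1 uses coarse basic-level class ids; stage 2 uses the original
    subordinate-level class ids. How the two stages are scheduled or
    weighted is left open here, since the abstract does not say.
    """
    basic_index = build_index(list(SUBORDINATE_TO_BASIC.values()))
    subordinate_index = build_index(list(SUBORDINATE_TO_BASIC.keys()))

    stage1 = [
        (image_id, basic_index[SUBORDINATE_TO_BASIC[label]])
        for image_id, label in samples
    ]
    stage2 = [(image_id, subordinate_index[label]) for image_id, label in samples]
    return stage1, stage2


# Example usage with made-up image ids.
samples = [("img0", "beagle"), ("img1", "tabby"), ("img2", "poodle")]
stage1_targets, stage2_targets = make_stage_targets(samples)
```

In this sketch, `beagle` and `poodle` share one stage-1 target (`dog`) but keep distinct stage-2 targets, which is the sense in which a coarse categorization precedes the fine one. For ILSVRC 2012, the basic-level grouping would have to come from some label hierarchy; which hierarchy and which cut through it the authors used is not stated in the abstract.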