U(1) Symmetry-breaking Observed in Generic CNN Bottleneck Layers

Abstract

We report on a novel model linking deep convolutional neural networks (CNN) to biological vision and fundamental particle physics. Information propagation in a CNN is modeled via an analogy to an optical system, where information is concentrated near a bottleneck where the 2D spatial resolution collapses about a focal point $1\times 1=1$. A 3D space $(x,y,t)$ is defined by $(x,y)$ coordinates in the image plane and CNN layer $t$, where a principal ray $(0,0,t)$ runs in the direction of information propagation through both the optical axis and the image center pixel located at $(x,y)=(0,0)$, about which the sharpest possible spatial focus is limited to a circle of confusion in the image plane. Our novel insight is to model the principal optical ray $(0,0,t)$ as geometrically equivalent to the medial vector in the positive orthant $I(x,y) \in R^{N+}$ of an $N$-channel activation space, e.g. along the greyscale (or luminance) vector $(t,t,t)$ in $RGB$ colour space. Information is thus concentrated into an energy potential $E(x,y,t)=\|I(x,y,t)\|^2$, which, particularly for bottleneck layers $t$ of generic CNNs, is highly concentrated and symmetric about the spatial origin $(0,0,t)$ and exhibits the well-known "Sombrero" potential of the boson particle. This symmetry is broken in classification, where bottleneck layers of generic pre-trained CNN models exhibit a consistent class-specific bias towards an angle $\theta \in U(1)$ defined simultaneously in the image plane and in activation feature space. Initial observations validate our hypothesis from generic pre-trained CNN activation maps and a bare-bones memory-based classification scheme, with no training or tuning. Training from scratch using combined one-hot $+ U(1)$ loss improves classification for all tasks tested including ImageNet.
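As a rough illustration of the energy potential $E(x,y,t)=\|I(x,y,t)\|^2$ at a single layer $t$, the following NumPy sketch sums squared activations over channels for an array of shape (N, H, W). The function names `energy_map` and `centroid_angle` are hypothetical. The angle computed here, that of the energy-weighted centroid about the image center, is only an assumed stand-in for an image-plane angle $\theta$; the abstract does not specify how $\theta$ is actually computed.

```python
import numpy as np


def energy_map(activations):
    """Return E(x, y) = ||I(x, y)||^2 for activations of shape (N, H, W)."""
    # Sum of squared channel values at each spatial location.
    return np.sum(activations ** 2, axis=0)


def centroid_angle(energy):
    """Angle (radians) of the energy-weighted centroid about the image center.

    Assumption for illustration only: this is not taken from the paper.
    Image coordinates are used, so y increases downwards.
    A perfectly symmetric map has its centroid at the origin, where the
    angle is undefined (np.arctan2 returns 0.0 in that case).
    """
    h, w = energy.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0  # shift so the center pixel is x = 0
    ys = ys - (h - 1) / 2.0  # shift so the center pixel is y = 0
    total = energy.sum()
    cx = (energy * xs).sum() / total
    cy = (energy * ys).sum() / total
    return np.arctan2(cy, cx)


# Toy 2-channel, 3x3 activation map with all energy right of center.
acts = np.zeros((2, 3, 3))
acts[:, 1, 2] = 1.0
E = energy_map(acts)
theta = centroid_angle(E)
```

In this toy case the energy sits on a single pixel directly to the right of the center, so the centroid lies on the positive x axis. An energy map that is symmetric about the center would carry no such directional bias.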
