We demonstrate that a very deep ResNet with stacked modules with one neuron per hidden layer and ReLU activation functions can uniformly approximate any Lebesgue integrable function in $d$ dimensions, i.e. $\ell_1(\mathbb{R}^d)$. Because of the identity mapping inherent to ResNets, our network has alternating layers of dimension one and $d$. This stands in sharp contrast to fully connected networks, which are not universal approximators if their width is the input dimension $d$ [Lu et al, 2017; Hanin and Sellke, 2017]. Hence, our result implies an increase in representational power for narrow deep networks by the ResNet architecture.
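The sketch below is an illustrative interpretation (not the authors' code) of the architecture described above: each residual block projects the $d$-dimensional state down to a single ReLU unit and adds a rank-one update back onto the identity path, so layer widths alternate between one and $d$. The parameter names (`u`, `b`, `v`) and the random initialization are assumptions for demonstration only.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def one_neuron_res_block(x, u, b, v):
    """x, u, v in R^d; b scalar. One-neuron hidden layer plus identity skip."""
    h = relu(u @ x + b)      # scalar activation of the single hidden neuron
    return x + v * h         # identity mapping plus rank-one correction in R^d

def deep_resnet(x, params):
    """Stack many one-neuron blocks; params is a list of (u, b, v) tuples."""
    for u, b, v in params:
        x = one_neuron_res_block(x, u, b, v)
    return x

# Illustrative usage with random (untrained) parameters.
rng = np.random.default_rng(0)
d, depth = 3, 10
params = [(rng.normal(size=d), rng.normal(), rng.normal(size=d)) for _ in range(depth)]
print(deep_resnet(rng.normal(size=d), params))
```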