
The Power of Depth for Feedforward Neural Networks

Abstract

We show that there are simple functions, expressible by small 3-layer feedforward neural networks, that cannot be approximated by any 2-layer network to more than a certain constant accuracy unless its width is exponential in the dimension. The result holds for most continuous activation functions, including rectified linear units and sigmoids, and formally demonstrates that depth, even when increased by just one layer, can be exponentially more valuable than width for standard feedforward neural networks.
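To make the layer-counting convention concrete: a "2-layer" network here means a single hidden layer followed by a linear output, and a "3-layer" network means two hidden layers followed by a linear output, with width measured as the number of hidden units per layer. The minimal sketch below, using ReLU activations and arbitrary dimensions of my choosing (the specific shapes and names are illustrative, not taken from the paper), spells out the two architectures being compared.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def two_layer(x, W1, b1, w2):
    """A "2-layer" network: one hidden ReLU layer, then a linear output.

    The network's width is the number of hidden units, W1.shape[0].
    """
    return w2 @ relu(W1 @ x + b1)

def three_layer(x, W1, b1, W2, b2, w3):
    """A "3-layer" network: two hidden ReLU layers, then a linear output."""
    h1 = relu(W1 @ x + b1)
    h2 = relu(W2 @ h1 + b2)
    return w3 @ h2

# Example with input dimension d and hidden width k (arbitrary values).
d, k = 16, 32
x = rng.standard_normal(d)
y2 = two_layer(x, rng.standard_normal((k, d)), rng.standard_normal(k),
               rng.standard_normal(k))
y3 = three_layer(x, rng.standard_normal((k, d)), rng.standard_normal(k),
                 rng.standard_normal((k, k)), rng.standard_normal(k),
                 rng.standard_normal(k))
```

In these terms, the separation result says that some function computed by a `three_layer` network of width polynomial in d cannot be approximated to better than constant accuracy by any `two_layer` network unless its width k grows exponentially in d.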
