The Power of Depth for Feedforward Neural Networks
Abstract
We show that there are simple functions on $\mathbb{R}^d$, expressible by small 3-layer feedforward neural networks, which cannot be approximated by any 2-layer network to more than a certain constant accuracy unless its width is exponential in the dimension. The result holds for most continuous activation functions, including rectified linear units and sigmoids, and is a formal demonstration that depth -- even if increased by 1 -- can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different.
