
Depth Separation in ReLU Networks for Approximating Smooth Non-Linear Functions

Abstract

We provide a depth-based separation result for feed-forward ReLU neural networks, showing that a wide family of non-linear, twice-differentiable functions on $[0,1]^d$, which can be approximated to accuracy $\epsilon$ by ReLU networks of depth and width $\mathcal{O}(\text{poly}(\log(1/\epsilon)))$, cannot be approximated to similar accuracy by constant-depth ReLU networks unless their width is at least $\Omega(1/\epsilon)$.
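The efficiency of deep networks here is in the spirit of Yarotsky's classical construction, in which composing ReLU "sawtooth" layers approximates $x^2$ on $[0,1]$ with error decaying like $4^{-m}$ in the depth $m$, so depth $\mathcal{O}(\log(1/\epsilon))$ suffices. A minimal NumPy sketch of that construction (the function names are illustrative, not from the paper):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sawtooth(x):
    # Hat function: g(x) = 2x on [0, 1/2], 2(1 - x) on [1/2, 1],
    # realized exactly by a single hidden layer of three ReLU units.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)

def deep_relu_square(x, depth):
    # Yarotsky-style approximation of x^2 on [0, 1]:
    #   x^2 ~ x - sum_{s=1}^{depth} g_s(x) / 4^s,
    # where g_s is the s-fold composition of the sawtooth,
    # i.e. a ReLU network with `depth` hidden layers.
    g = x
    approx = x.copy()
    for s in range(1, depth + 1):
        g = sawtooth(g)
        approx = approx - g / 4.0 ** s
    return approx

xs = np.linspace(0.0, 1.0, 1001)
for m in (2, 4, 8):
    err = np.max(np.abs(deep_relu_square(xs, m) - xs ** 2))
    print(m, err)  # uniform error is bounded by 4^{-m-1}
```

The error shrinks exponentially with depth, so reaching accuracy $\epsilon$ needs only $\mathcal{O}(\log(1/\epsilon))$ layers of constant width; the paper's lower bound shows that flattening such a network to constant depth forces width $\Omega(1/\epsilon)$.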
