
Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces

Journal of Machine Learning Research (JMLR), 2022
Abstract

Let $\Omega = [0,1]^d$ be the unit cube in $\mathbb{R}^d$. We study the problem of how efficiently, in terms of the number of parameters, deep neural networks with the ReLU activation function can approximate functions in the Sobolev spaces $W^s(L_q(\Omega))$ and Besov spaces $B^s_r(L_q(\Omega))$, with error measured in the $L_p(\Omega)$ norm. This problem is important when studying the application of neural networks in a variety of fields, including scientific computing and signal processing, and has previously been solved only when $p = q = \infty$. Our contribution is to provide a complete solution for all $1 \leq p, q \leq \infty$ and $s > 0$ for which the corresponding Sobolev or Besov space compactly embeds into $L_p$. The key technical tool is a novel bit-extraction technique which gives an optimal encoding of sparse vectors. This enables us to obtain sharp upper bounds in the non-linear regime where $p > q$. We also provide a novel method for deriving $L_p$-approximation lower bounds based upon VC-dimension when $p < \infty$. Our results show that very deep ReLU networks significantly outperform classical methods of approximation in terms of the number of parameters, but that this comes at the cost of parameters which are not encodable.
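As background for the bit-extraction idea mentioned above: the classical version of this trick, familiar from VC-dimension constructions for ReLU networks, packs a bit string into a single real weight and recovers the bits by repeated doubling and thresholding. The plain-Python sketch below illustrates that generic mechanism only; it is not the paper's optimized sparse-vector encoding, and the helper names `encode_bits` and `extract_bits` are hypothetical.

```python
# Illustrative sketch of generic bit extraction: pack bits into one real
# number, then recover them by doubling and thresholding. A deep ReLU
# network can approximate the threshold step to arbitrary precision;
# here it is exact for clarity.

def encode_bits(bits):
    """Pack bits b_1..b_k into the single real w = sum_i b_i * 2**(-i)."""
    return sum(b * 2.0 ** -(i + 1) for i, b in enumerate(bits))

def extract_bits(w, k):
    """Recover k bits from w: double, read off the integer part, repeat."""
    out = []
    for _ in range(k):
        w *= 2.0
        b = 1 if w >= 1.0 else 0  # threshold step (ReLU-approximable)
        out.append(b)
        w -= b
    return out

bits = [1, 0, 1, 1, 0, 0, 1, 0]
w = encode_bits(bits)
assert extract_bits(w, len(bits)) == bits
```

The point of such constructions is that one parameter can carry many bits of information, which is what makes the parameter counts in the upper bounds so small, and also why the abstract notes that the resulting parameters are not encodable: they must be stored to very high precision.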
