arXiv: 1702.08489
Depth Separation for Neural Networks

27 February 2017
Amit Daniely
Abstract

Let $f:\mathbb{S}^{d-1}\times\mathbb{S}^{d-1}\to\mathbb{R}$ be a function of the form $f(\mathbf{x},\mathbf{x}') = g(\langle\mathbf{x},\mathbf{x}'\rangle)$ for $g:[-1,1]\to\mathbb{R}$. We give a simple proof showing that poly-size depth-two neural networks with (exponentially) bounded weights cannot approximate $f$ whenever $g$ cannot be approximated by a low-degree polynomial. Moreover, for many $g$'s, such as $g(x)=\sin(\pi d^3 x)$, the number of neurons must be $2^{\Omega(d\log(d))}$. Furthermore, the result holds w.r.t. the uniform distribution on $\mathbb{S}^{d-1}\times\mathbb{S}^{d-1}$. As many functions of the above form can be well approximated by poly-size depth-three networks with poly-bounded weights, this establishes a separation between depth-two and depth-three networks w.r.t. the uniform distribution on $\mathbb{S}^{d-1}\times\mathbb{S}^{d-1}$.
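The lower bound rests on $g(x)=\sin(\pi d^3 x)$ oscillating roughly $d^3$ times on $[-1,1]$, so no polynomial of degree well below $\pi d^3$ can approximate it. A minimal numerical sketch of this phenomenon (the dimension $d$, sample grid, and degrees below are my own illustrative choices, not values from the paper):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Illustrative sketch: g(t) = sin(pi * d^3 * t) oscillates about d^3
# times on [-1, 1]. A least-squares polynomial fit of degree far below
# pi * d^3 (~85 here) misses it badly, while a degree comfortably above
# that threshold captures it to high accuracy.

d = 3                                  # g(t) = sin(27 * pi * t)
t = np.linspace(-1.0, 1.0, 4001)       # dense sample grid on [-1, 1]
g = np.sin(np.pi * d**3 * t)

def cheb_fit_error(deg):
    """RMS error of the degree-`deg` least-squares Chebyshev fit to g."""
    coeffs = C.chebfit(t, g, deg)
    return np.sqrt(np.mean((C.chebval(t, coeffs) - g) ** 2))

err_low = cheb_fit_error(20)           # far below the oscillation scale
err_high = cheb_fit_error(150)         # above it
print(err_low, err_high)               # err_low is large, err_high tiny
```

A degree-20 polynomial has at most 20 zeros, while $\sin(27\pi t)$ crosses zero 54 times on $[-1,1]$, so the low-degree fit is essentially flat and its RMS error stays near $1/\sqrt{2}$; past the oscillation scale the Chebyshev coefficients decay super-exponentially, which is why the high-degree error collapses.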
