Sharp Representation Theorems for ReLU Networks with Precise Dependence on Depth

7 June 2020
Guy Bresler
Dheeraj M. Nagaraj
arXiv:2006.04048
Abstract

We prove sharp dimension-free representation results for neural networks with $D$ ReLU layers under square loss for a class of functions $\mathcal{G}_D$ defined in the paper. These results capture the precise benefits of depth in the following sense: 1. The rates for representing the class of functions $\mathcal{G}_D$ via $D$ ReLU layers are sharp up to constants, as shown by matching lower bounds. 2. For each $D$, $\mathcal{G}_D \subseteq \mathcal{G}_{D+1}$, and as $D$ grows the class $\mathcal{G}_D$ contains progressively less smooth functions. 3. If $D' < D$, then the approximation rate for the class $\mathcal{G}_D$ achieved by depth-$D'$ networks is strictly worse than that achieved by depth-$D$ networks. This constitutes a fine-grained characterization of the representation power of feedforward networks of arbitrary depth $D$ and number of neurons $N$, in contrast to existing representation results, which either require $D$ to grow quickly with $N$ or assume that the function being represented is highly smooth; in the latter case, similar rates can be obtained with a single nonlinear layer. Our results confirm the prevailing hypothesis that deeper networks are better at representing less smooth functions, and indeed, the main technical novelty is to fully exploit the fact that deep networks can produce highly oscillatory functions with few activation functions.
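To make the final point concrete, here is a minimal NumPy sketch of the well-known phenomenon the abstract alludes to (it is an illustration, not the paper's construction; the names `tent_layer` and `deep_sawtooth` are made up for this example): composing a two-unit ReLU "tent" layer $D$ times yields a sawtooth with $2^D$ linear pieces and about $2^{D-1}$ oscillations from only $2D$ ReLU units, whereas a single hidden layer with $N$ ReLU units can produce at most $N+1$ linear pieces.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def tent_layer(x):
    # One ReLU layer with only 2 activation units:
    # t(x) = 2*relu(x) - 4*relu(x - 0.5) is the tent map on [0, 1].
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    # Composing the tent map `depth` times gives a sawtooth with
    # 2**depth linear pieces, using only 2*depth ReLU units in total.
    for _ in range(depth):
        x = tent_layer(x)
    return x

if __name__ == "__main__":
    xs = np.linspace(0.0, 1.0, 2049)
    for D in (1, 3, 6):
        ys = deep_sawtooth(xs, D)
        # Count strict local maxima as a crude measure of oscillation.
        peaks = np.sum((ys[1:-1] > ys[:-2]) & (ys[1:-1] > ys[2:]))
        print(f"depth={D}: ~{peaks} oscillations from {2 * D} ReLU units")
```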
