On the number of response regions of deep feed forward networks with piece-wise linear activations

20 December 2013
Razvan Pascanu
Guido Montúfar
Yoshua Bengio
Abstract

This paper explores the complexity of deep feedforward networks with linear pre-synaptic couplings and rectified linear activations. This is a contribution to the growing body of work contrasting the representational power of deep and shallow network architectures. In particular, we offer a framework, based on computational geometry, for comparing deep and shallow models that belong to the family of piecewise linear functions. We look at a deep rectifier multi-layer perceptron (MLP) with linear output units and compare it with a single-layer version of the model. In the asymptotic regime, when the number of inputs stays constant, if the shallow model has $kn$ hidden units and $n_0$ inputs, then the number of linear regions is $O(k^{n_0} n^{n_0})$. For a $k$-layer model with $n$ hidden units on each layer it is $\Omega(\lfloor n/n_0 \rfloor^{k-1} n^{n_0})$. The number $\lfloor n/n_0 \rfloor^{k-1}$ grows faster than $k^{n_0}$ when $n$ tends to infinity, or when $k$ tends to infinity and $n \geq 2n_0$. Additionally, even when $k$ is small, if we restrict $n$ to be $2n_0$, we can show that a deep model has considerably more linear regions than a shallow one. We consider this as a first step towards understanding the complexity of these models and specifically towards providing suitable mathematical tools for future analysis.
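The asymptotic comparison quoted above is easy to check numerically. Below is a minimal sketch (my own illustration, not code from the paper) that evaluates the shallow upper bound $O(k^{n_0} n^{n_0})$ and the deep lower bound $\Omega(\lfloor n/n_0 \rfloor^{k-1} n^{n_0})$ for matched architectures: a shallow rectifier MLP with $kn$ hidden units versus a deep one with $k$ layers of $n$ hidden units, both on $n_0$ inputs. All function names are mine.

# Illustration only, not code from the paper: evaluate the two bounds
# quoted in the abstract for matched architectures, i.e. a shallow
# rectifier MLP with k*n hidden units versus a deep one with k layers
# of n hidden units, both on n0 inputs. Function names are mine.

def shallow_upper_bound(k: int, n: int, n0: int) -> int:
    # O(k^{n0} * n^{n0}): linear regions of the shallow model (kn units)
    return (k ** n0) * (n ** n0)

def deep_lower_bound(k: int, n: int, n0: int) -> int:
    # Omega(floor(n/n0)^{k-1} * n^{n0}): linear regions of the deep model
    return ((n // n0) ** (k - 1)) * (n ** n0)

if __name__ == "__main__":
    n0 = 2  # fixed input dimension, matching the asymptotic regime
    print(f"{'k':>3} {'n':>4} {'shallow bound':>14} {'deep bound':>14}")
    for k in (2, 4, 8):
        for n in (4, 8, 16):
            print(f"{k:>3} {n:>4} "
                  f"{shallow_upper_bound(k, n, n0):>14} "
                  f"{deep_lower_bound(k, n, n0):>14}")

Running this shows the behavior the abstract describes: for small $k$ the shallow upper bound can exceed the deep lower bound, but the floor term $\lfloor n/n_0 \rfloor^{k-1}$ dominates quickly; already at $k = 8$, $n = 16$ the deep lower bound exceeds the shallow upper bound by more than four orders of magnitude.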
