
On the number of response regions of deep feed forward networks with piece-wise linear activations

Abstract

This paper explores the complexity of deep feedforward networks with linear pre-synaptic couplings and rectified linear activations. This is a contribution to the growing body of work contrasting the representational power of deep and shallow network architectures. In particular, we offer a framework, based on computational geometry, for comparing deep and shallow models that belong to the family of piecewise linear functions. We look at a deep rectifier multi-layer perceptron (MLP) with linear output units and compare it with a single-layer version of the model. In the asymptotic regime, when the number of inputs stays constant, if the shallow model has $kn$ hidden units and $n_0$ inputs, then its number of linear regions is $O(k^{n_0} n^{n_0})$. For a $k$-layer model with $n$ hidden units on each layer it is $\Omega(\lfloor n/n_0 \rfloor^{k-1} n^{n_0})$. The factor $\lfloor n/n_0 \rfloor^{k-1}$ grows faster than $k^{n_0}$ when $n$ tends to infinity, or when $k$ tends to infinity and $n \geq 2n_0$. Additionally, even when $k$ is small, if we restrict $n$ to be $2n_0$, we can show that a deep model has considerably more linear regions than a shallow one. We consider this a first step towards understanding the complexity of these models, and specifically towards providing suitable mathematical tools for future analysis.
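For intuition, here is a minimal Python sketch (our illustration, not part of the paper) that plugs concrete values into the two asymptotic expressions above, dropping constant factors. The function names `shallow_upper_bound` and `deep_lower_bound` are hypothetical labels for those expressions; both models are given $kn$ hidden units in total, and the comparison uses the regime $n = 2n_0$ mentioned in the abstract.

```python
from math import floor

def shallow_upper_bound(k: int, n: int, n0: int) -> int:
    """O(k^{n0} n^{n0}) bound on linear regions of a shallow
    rectifier MLP with k*n hidden units and n0 inputs
    (constant factors dropped; illustrative only)."""
    return (k ** n0) * (n ** n0)

def deep_lower_bound(k: int, n: int, n0: int) -> int:
    """Omega(floor(n/n0)^{k-1} n^{n0}) bound on linear regions
    of a k-layer rectifier MLP with n units per layer
    (constant factors dropped; illustrative only)."""
    return (floor(n / n0) ** (k - 1)) * (n ** n0)

if __name__ == "__main__":
    n0 = 2          # fixed input dimension
    n = 2 * n0      # the restricted regime n = 2*n0
    for k in (2, 4, 8, 16):
        # Both models have k*n hidden units in total.
        print(k, shallow_upper_bound(k, n, n0), deep_lower_bound(k, n, n0))
```

With these values the deep lower bound grows exponentially in $k$ (doubling with each added layer) while the shallow upper bound grows only polynomially, so the deep expression overtakes the shallow one by $k = 8$.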
