
On the number of inference regions of deep feed forward networks with piece-wise linear activations

Abstract

This paper explores the complexity of deep feed forward networks with linear pre-synaptic couplings and rectified linear activations. This is a contribution to the growing body of work contrasting the representational power of deep and shallow network architectures. In particular, we offer a framework, based on computational geometry, for comparing deep and shallow models that belong to the family of piecewise linear functions. We look at a deep rectifier multi-layer perceptron (MLP) with linear output units and compare it with a single-layer version of the model. In the asymptotic regime, when the number of inputs stays constant, if the shallow model has $kn$ hidden units and $n_0$ inputs, then the number of linear regions is $O(k^{n_0} n^{n_0})$. For a $k$-layer model with $n$ hidden units on each layer it is $\Omega\left(\left(n/n_0\right)^{k-1} n^{n_0}\right)$. The factor $\left(n/n_0\right)^{k-1}$ grows faster than $k^{n_0}$ when either $n$ goes to infinity, or when $k$ goes to infinity and $n > 2n_0$. We consider this a first step towards understanding the complexity of these models and, specifically, towards providing suitable mathematical tools for future analysis.
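To illustrate how the two expressions quoted above diverge, the following is a minimal sketch (not from the paper) that tabulates the shallow-model upper bound $k^{n_0} n^{n_0}$ against the deep-model lower bound $(n/n_0)^{k-1} n^{n_0}$ for hypothetical sizes $n_0 = 2$ and $n = 8$ (so that $n > 2n_0$); both models are compared at the same total number of hidden units, $kn$.

```python
# Sketch: compare the asymptotic expressions from the abstract for a few depths k.
# Assumed, illustrative values: n0 = 2 inputs, n = 8 hidden units per layer (n > 2*n0).

def shallow_upper_bound(k: int, n: int, n0: int) -> int:
    """O(k^{n0} * n^{n0}): linear regions of a one-hidden-layer net with k*n units."""
    return (k ** n0) * (n ** n0)

def deep_lower_bound(k: int, n: int, n0: int) -> float:
    """Omega((n/n0)^{k-1} * n^{n0}): linear regions of a k-layer net with n units per layer."""
    return (n / n0) ** (k - 1) * (n ** n0)

if __name__ == "__main__":
    n0, n = 2, 8
    for k in (2, 4, 8, 16):
        print(f"k={k:2d}  shallow O(.) ~ {shallow_upper_bound(k, n, n0):.3e}  "
              f"deep Omega(.) ~ {deep_lower_bound(k, n, n0):.3e}")
```

For these (assumed) sizes the deep lower bound overtakes the shallow upper bound as $k$ increases, which is the qualitative point of the comparison in the abstract.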
