Nearly-tight VC-dimension bounds for piecewise linear neural networks

Abstract

We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of weights and $L$ be the number of layers, we prove that the VC-dimension is $O(WL\log(W))$ and $\Omega(WL\log(W/L))$. This improves both the previously known upper bounds and lower bounds. In terms of the number $U$ of non-linear units, we prove a tight bound $\Theta(WU)$ on the VC-dimension. All of these results generalize to arbitrary piecewise linear activation functions.
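To make the bound expressions concrete, the following minimal sketch (not from the paper) counts $W$, $L$, and $U$ for a fully connected ReLU network and evaluates the $WL\log(W)$ and $WL\log(W/L)$ expressions with the hidden asymptotic constants omitted. The parameter-counting convention (weights plus biases, hidden units only as non-linear units) is an illustrative assumption.

```python
import math

def relu_net_params(layer_sizes):
    """Count weights W, weight layers L, and non-linear (ReLU) units U
    for a fully connected network with the given widths (input first).
    Convention (an assumption): biases count as weights, and only
    hidden units are non-linear."""
    W = sum((n_in + 1) * n_out
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    L = len(layer_sizes) - 1
    U = sum(layer_sizes[1:-1])
    return W, L, U

def vc_bound_expressions(W, L):
    """Evaluate the bound expressions from the abstract, dropping the
    constants hidden in the O(.) and Omega(.) notation."""
    upper = W * L * math.log(W)       # O(W L log W) upper bound
    lower = W * L * math.log(W / L)   # Omega(W L log(W/L)) lower bound
    return upper, lower

W, L, U = relu_net_params([10, 20, 20, 1])
up, lo = vc_bound_expressions(W, L)
print(f"W={W}, L={L}, U={U}")
print(f"W*L*log(W) = {up:.0f}, W*L*log(W/L) = {lo:.0f}")
```

For moderate depth the two expressions are close, which reflects the abstract's claim that the bounds are tight for almost the entire range of parameters; they diverge only when $L$ is large relative to $W$.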
