Information Theoretic Sample Complexity Lower Bound for Feed-Forward Fully-Connected Deep Networks

Abstract

In this paper, we study the sample complexity lower bound of a d-layer feed-forward, fully-connected neural network for binary classification, using information-theoretic tools. Specifically, we propose a backward data generating process, where the input is generated based on the binary output, and the network is parametrized by weight parameters for the hidden layers. The sample complexity lower bound is of order Ω(log(r) + p/(rd)), where p is the dimension of the input, r is the rank of the weight matrices, and d is the number of hidden layers. To the best of our knowledge, our result is the first information-theoretic sample complexity lower bound for such networks.
