Wide stochastic networks: Gaussian limit and PAC-Bayesian training

International Conference on Algorithmic Learning Theory (ALT), 2021
George Deligiannidis
Abstract

The limit of infinite width allows for substantial simplifications in the analytical study of overparameterized neural networks. With a suitable random initialization, an extremely wide network is well approximated by a Gaussian process, both before and during training. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimizes the generalization bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.
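To make the abstract's central idea concrete, the following is a minimal sketch (not the authors' code) of PAC-Bayesian training: a stochastic network whose weights are Gaussian random variables is trained on an objective combining the empirical risk with a KL complexity term from a McAllester-style bound. The specific bound, the prior scale, and all names are illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch of PAC-Bayesian bound optimization for a stochastic
# (Gaussian-weight) linear network.  The McAllester-style complexity
# term below is an assumption; the paper's exact bound may differ.
import torch
import torch.nn.functional as F

n = 1000        # number of training examples (assumed)
delta = 0.05    # confidence level of the bound (assumed)

# Variational posterior over one weight matrix: independent Gaussians.
mu = torch.zeros(784, 10, requires_grad=True)
log_sigma = torch.full((784, 10), -3.0, requires_grad=True)

# Prior: the distribution used at initialization, a common PAC-Bayes choice.
prior = torch.distributions.Normal(torch.zeros(784, 10), 0.1)

opt = torch.optim.Adam([mu, log_sigma], lr=1e-3)

def pac_bayes_objective(x, y):
    # Sample weights with the reparameterization trick.
    w = mu + log_sigma.exp() * torch.randn_like(mu)
    empirical_risk = F.cross_entropy(x @ w, y)
    q = torch.distributions.Normal(mu, log_sigma.exp())
    kl = torch.distributions.kl_divergence(q, prior).sum()
    # McAllester-style penalty: sqrt((KL + ln(2*sqrt(n)/delta)) / (2n)).
    penalty = torch.sqrt(
        (kl + torch.log(torch.tensor(2 * n**0.5 / delta))) / (2 * n)
    )
    return empirical_risk + penalty

# One optimization step on synthetic data (stand-in for MNIST).
x, y = torch.randn(128, 784), torch.randint(0, 10, (128,))
opt.zero_grad()
pac_bayes_objective(x, y).backward()
opt.step()
```

Because the bound itself is the training objective, the optimized value directly certifies a generalization guarantee for the stochastic network, which is the appeal of this approach over training the risk alone and bounding afterwards.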
