Rate Optimal Variational Bayesian Inference for Sparse DNN
- BDL
Sparse deep neural networks (DNNs) have drawn much attention in recent studies because they possess great approximation power and are easier to implement and store in practice compared to fully connected DNNs. In this work, we consider variational Bayesian inference, a computationally efficient alternative to Markov chain Monte Carlo methods, for sparse DNN modeling under a spike-and-slab prior. Our theoretical investigation shows that, for any $\alpha$-H\"older smooth function, the variational posterior distribution achieves the (near-)optimal contraction rate, and the variational inference leads to (near-)optimal generalization error, as long as the network architecture is properly tuned according to the smoothness parameter $\alpha$. Furthermore, an adaptive variational inference procedure is developed to automatically select the optimal network structure even when $\alpha$ is unknown. Our results also apply to the case where the true function is itself a ReLU neural network, for which a certain contraction bound is obtained.
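To make the setup concrete, below is a minimal sketch (in PyTorch) of mean-field variational inference for a single linear layer under a spike-and-slab prior $p(w) = \lambda N(0, \sigma_0^2) + (1-\lambda)\delta_0$, with variational family $q(w) = \theta N(\mu, \sigma^2) + (1-\theta)\delta_0$. The class name, hyperparameter values, and the concrete (Gumbel-softmax) relaxation of the Bernoulli gates are illustrative choices for this sketch, not the paper's exact procedure; the paper's contribution is the theory for the resulting variational posterior, not a specific implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpikeSlabLinear(nn.Module):
    """Linear layer with a spike-and-slab variational posterior over weights.

    Prior:        p(w) = lam * N(0, sigma0^2) + (1 - lam) * delta_0
    Variational:  q(w) = theta * N(mu, sigma^2) + (1 - theta) * delta_0
    (names and defaults are illustrative, not from the paper)
    """
    def __init__(self, d_in, d_out, prior_lam=0.1, prior_sigma0=1.0, temp=0.5):
        super().__init__()
        self.mu = nn.Parameter(0.1 * torch.randn(d_out, d_in))
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -3.0))
        # logit of the variational inclusion probability theta
        self.gate_logit = nn.Parameter(torch.zeros(d_out, d_in))
        self.prior_lam, self.prior_sigma0, self.temp = prior_lam, prior_sigma0, temp

    def forward(self, x):
        sigma = self.log_sigma.exp()
        # Concrete relaxation of the Bernoulli gate keeps the sparsity
        # pattern differentiable during training.
        u = torch.rand_like(self.gate_logit).clamp(1e-6, 1 - 1e-6)
        gate = torch.sigmoid(
            (self.gate_logit + torch.log(u) - torch.log1p(-u)) / self.temp)
        w = gate * (self.mu + sigma * torch.randn_like(sigma))  # reparameterized slab
        return F.linear(x, w)

    def kl(self):
        sigma, theta = self.log_sigma.exp(), torch.sigmoid(self.gate_logit)
        # KL between the slab Gaussians: KL(N(mu, sigma^2) || N(0, sigma0^2))
        kl_slab = (torch.log(self.prior_sigma0 / sigma)
                   + (sigma ** 2 + self.mu ** 2) / (2 * self.prior_sigma0 ** 2) - 0.5)
        # Exact decomposition for mixtures sharing the point mass at zero.
        kl_gate = (theta * torch.log(theta / self.prior_lam)
                   + (1 - theta) * torch.log((1 - theta) / (1 - self.prior_lam)))
        return (theta * kl_slab + kl_gate).sum()

# Usage: minimize the negative ELBO = NLL + KL on toy regression data.
torch.manual_seed(0)
x, y = torch.randn(256, 10), torch.randn(256, 1)
layer = SpikeSlabLinear(10, 1)
opt = torch.optim.Adam(layer.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = F.mse_loss(layer(x), y, reduction="sum") + layer.kl()
    loss.backward()
    opt.step()
```

After training, weights whose inclusion probability theta is near zero can be pruned, which is how the spike-and-slab posterior induces a sparse network in practice.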
View on arXiv