Rate Optimal Variational Bayesian Inference for Sparse DNN
- BDL
Sparse deep neural networks (DNNs) have drawn much attention in recent studies because they possess great approximation power and are easier to implement and store in practice compared to fully connected DNNs. In this work, we consider variational Bayesian inference, a computationally efficient alternative to Markov chain Monte Carlo methods, for sparse DNN modeling under a spike-and-slab prior. Our theoretical investigation shows that, for any $\alpha$-H\"older smooth function, the variational posterior distribution achieves the (near-)optimal contraction rate, and the variational inference leads to (near-)optimal generalization error, as long as the network architecture is properly tuned according to the smoothness parameter $\alpha$. Furthermore, an adaptive variational inference procedure is developed to automatically select the optimal network structure even when $\alpha$ is unknown. Our results also apply to the case where the true function is itself a ReLU neural network, for which a certain contraction bound is obtained.
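To make the setup concrete, below is a minimal sketch (in PyTorch) of mean-field variational inference for a single linear layer under a spike-and-slab prior $p(w) = \lambda N(0, \sigma_0^2) + (1-\lambda)\delta_0$, with variational family $q(w) = \theta N(\mu, \sigma^2) + (1-\theta)\delta_0$. The class name, hyperparameter values, and the concrete (Gumbel-softmax) relaxation of the Bernoulli gates are illustrative choices for this sketch, not the paper's exact procedure; the paper's contribution is the theory for the resulting variational posterior, not a specific implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpikeSlabLinear(nn.Module):
    """Linear layer with a spike-and-slab variational posterior over weights.

    Prior:        p(w) = lam * N(0, sigma0^2) + (1 - lam) * delta_0
    Variational:  q(w) = theta * N(mu, sigma^2) + (1 - theta) * delta_0
    (names and defaults are illustrative, not from the paper)
    """
    def __init__(self, d_in, d_out, prior_lam=0.1, prior_sigma0=1.0, temp=0.5):
        super().__init__()
        self.mu = nn.Parameter(0.1 * torch.randn(d_out, d_in))
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -3.0))
        # logit of the variational inclusion probability theta
        self.gate_logit = nn.Parameter(torch.zeros(d_out, d_in))
        self.prior_lam, self.prior_sigma0, self.temp = prior_lam, prior_sigma0, temp

    def forward(self, x):
        sigma = self.log_sigma.exp()
        # Concrete relaxation of the Bernoulli gate keeps the sparsity
        # pattern differentiable during training.
        u = torch.rand_like(self.gate_logit).clamp(1e-6, 1 - 1e-6)
        gate = torch.sigmoid(
            (self.gate_logit + torch.log(u) - torch.log1p(-u)) / self.temp)
        w = gate * (self.mu + sigma * torch.randn_like(sigma))  # reparameterized slab
        return F.linear(x, w)

    def kl(self):
        sigma, theta = self.log_sigma.exp(), torch.sigmoid(self.gate_logit)
        # KL between the slab Gaussians: KL(N(mu, sigma^2) || N(0, sigma0^2))
        kl_slab = (torch.log(self.prior_sigma0 / sigma)
                   + (sigma ** 2 + self.mu ** 2) / (2 * self.prior_sigma0 ** 2) - 0.5)
        # Exact decomposition for mixtures sharing the point mass at zero.
        kl_gate = (theta * torch.log(theta / self.prior_lam)
                   + (1 - theta) * torch.log((1 - theta) / (1 - self.prior_lam)))
        return (theta * kl_slab + kl_gate).sum()

# Usage: minimize the negative ELBO = NLL + KL on toy regression data.
torch.manual_seed(0)
x, y = torch.randn(256, 10), torch.randn(256, 1)
layer = SpikeSlabLinear(10, 1)
opt = torch.optim.Adam(layer.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = F.mse_loss(layer(x), y, reduction="sum") + layer.kl()
    loss.backward()
    opt.step()
```

After training, weights whose inclusion probability theta is near zero can be pruned, which is how the spike-and-slab posterior induces a sparse network in practice.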
View on arXiv