Regularization and Computation with high-dimensional spike-and-slab posterior distributions

Abstract

We consider the Bayesian analysis of a high-dimensional statistical model with a spike-and-slab prior, and we study the forward-backward envelope of the posterior distribution, denoted $\check\Pi_{\gamma}$ for some regularization parameter $\gamma>0$. Viewing $\check\Pi_\gamma$ as a pseudo-posterior distribution, we work out a set of sufficient conditions under which it contracts towards the true value of the parameter as $\gamma\downarrow 0$ and $p$ (the dimension of the parameter space) diverges to $\infty$. In linear regression models the contraction rate matches that of the true posterior distribution. We also study a practical Markov Chain Monte Carlo (MCMC) algorithm to sample from $\check\Pi_{\gamma}$. In the particular case of the linear regression model, and focusing on models with high signal-to-noise ratios, we show that the mixing time of the MCMC algorithm depends crucially on the coherence of the design matrix and on the initialization of the Markov chain. In the most favorable cases, we show that the computational complexity of the algorithm scales with the dimension $p$ as $O(pe^{s_\star^2})$, where $s_\star$ is the number of non-zero components of the true parameter. We provide some simulation results to illustrate the theory. These results also suggest that the proposed algorithm, as well as a version of the Gibbs sampler of Narisetty and He (2014), mixes poorly when poorly initialized or when the design matrix has high coherence.
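To make the setting concrete, the following is a minimal sketch of a Gibbs sampler for linear regression under a *continuous* spike-and-slab prior in the spirit of Narisetty and He (2014) — each coefficient $\beta_j$ is drawn from a narrow "spike" Gaussian or a wide "slab" Gaussian according to an inclusion indicator $z_j$. This is **not** the paper's forward-backward-envelope algorithm; it is a standard baseline, and all hyperparameters (`tau0`, `tau1`, `q`, `sigma2`) are illustrative assumptions.

```python
import numpy as np

def spike_slab_gibbs(X, y, n_iter=2000, sigma2=1.0, tau0=1e-3, tau1=1.0, q=0.1, seed=0):
    """Toy Gibbs sampler for y = X beta + noise with the prior
    beta_j ~ (1 - z_j) N(0, tau0) + z_j N(0, tau1),  z_j ~ Bernoulli(q).
    Returns the last beta draw and posterior inclusion frequencies
    (averaged over the second half of the chain). Illustrative only."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    z = np.zeros(p, dtype=bool)
    incl = np.zeros(p)
    for it in range(n_iter):
        for j in range(p):
            # partial residual: remove all coordinates except j
            r = y - X @ beta + X[:, j] * beta[j]
            xx = X[:, j] @ X[:, j]
            xr = X[:, j] @ r

            # log marginal likelihood of r with beta_j ~ N(0, tau) integrated out
            # (additive constants independent of tau cancel in the odds ratio)
            def log_ml(tau):
                v = 1.0 / (xx / sigma2 + 1.0 / tau)
                return 0.5 * np.log(v / tau) + 0.5 * v * (xr / sigma2) ** 2

            # sample the inclusion indicator from its conditional posterior
            log_odds = np.log(q / (1 - q)) + log_ml(tau1) - log_ml(tau0)
            z[j] = rng.random() < 1.0 / (1.0 + np.exp(-log_odds))

            # sample beta_j from its conditional Gaussian posterior
            tau = tau1 if z[j] else tau0
            v = 1.0 / (xx / sigma2 + 1.0 / tau)
            beta[j] = rng.normal(v * xr / sigma2, np.sqrt(v))
        if it >= n_iter // 2:
            incl += z
    return beta, incl / (n_iter - n_iter // 2)
```

The per-sweep cost is $O(np)$, and — consistent with the abstract's discussion — such coordinate-wise samplers can mix slowly when the columns of `X` are highly coherent, since correlated coordinates are updated one at a time.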
