Mini-batch Tempered MCMC

In this paper we propose a general framework for performing MCMC with only a mini-batch of data. We show that by estimating the Metropolis-Hastings ratio with only a mini-batch of data, one is essentially sampling from the true posterior raised to a known temperature. We show by experiments that our method, Mini-batch Tempered MCMC (MINT-MCMC), can efficiently explore multiple modes of a posterior distribution. We demonstrate the application of MINT-MCMC as an inference tool for Bayesian neural networks. We also show that a cyclic version of our algorithm can be applied to build an ensemble of neural networks with little additional training cost. Building on the Equi-Energy sampler (Kou et al. 2006), we develop a new parallel MCMC algorithm that enables efficient sampling from high-dimensional multi-modal posteriors with well-separated modes.
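To make the role of temperature concrete, the following is a minimal sketch of plain random-walk Metropolis-Hastings applied to a target raised to the power 1/T. It is not the MINT-MCMC algorithm and does not use mini-batches; it only illustrates, under assumed settings, why sampling from a tempered distribution helps a chain move between separated modes. The bimodal target, the function names (`log_target`, `tempered_mh`, `count_mode_switches`), the step size, and the temperatures are all hypothetical choices made for this illustration.

```python
import math
import random


def log_target(x):
    """Log density (up to a constant) of an equal mixture of N(-4, 1) and N(4, 1).

    Hypothetical toy target chosen for illustration only.
    """
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def tempered_mh(log_p, temperature, n_steps, step_size=1.0, x0=-4.0, seed=0):
    """Random-walk Metropolis-Hastings targeting p(x)^(1/temperature)."""
    rng = random.Random(seed)
    x = x0
    log_px = log_p(x)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_pp = log_p(proposal)
        # The proposal is symmetric, so the log acceptance ratio is just the
        # log-density difference divided by the temperature.
        log_ratio = (log_pp - log_px) / temperature
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            x, log_px = proposal, log_pp
        samples.append(x)
    return samples


def count_mode_switches(samples):
    """Count sign changes, a crude proxy for jumps between the two modes."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))


cold = tempered_mh(log_target, temperature=1.0, n_steps=20000)
hot = tempered_mh(log_target, temperature=10.0, n_steps=20000)
print("mode switches at T=1: ", count_mode_switches(cold))
print("mode switches at T=10:", count_mode_switches(hot))
```

At T = 1 the chain has to cross a deep low-probability valley between the modes, so switches should be rare; at T = 10 the valley is much shallower and the chain should cross far more often. The samples at T > 1 come from the flattened distribution, not the original posterior, which is why a known temperature matters for interpreting them.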