
Control Variates for Stochastic Gradient MCMC

Abstract

It is well known that Markov chain Monte Carlo (MCMC) methods scale poorly with dataset size. We compare the performance of two classes of methods which aim to solve this issue: stochastic gradient MCMC (SGMCMC), and divide-and-conquer methods. We find an SGMCMC method, stochastic gradient Langevin dynamics (SGLD), to be the most robust in these comparisons. This method makes use of a noisy estimate of the gradient of the log posterior, which significantly reduces the per-iteration computational cost of the algorithm. We analyse the algorithm over different dataset sizes and show that, despite the per-iteration saving, the overall computational cost is still proportional to the dataset size. We use control variates, a method to reduce the variance in Monte Carlo estimates, to reduce this computational cost to O(1). Next we show that a different control variate technique, known as zero variance control variates, can be applied to SGMCMC algorithms for free. This post-processing step improves the inference of the algorithm by reducing the variance of the MCMC output. Zero variance control variates rely on the gradient of the log posterior; we explore how the variance reduction is affected by replacing this with the noisy gradient estimate calculated by SGMCMC.
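The two control variate ideas in the abstract can be illustrated together in a small sketch. Below is a toy SGLD-CV run on a one-dimensional Gaussian mean model with a flat prior (the model, the step size, and the names `sgld_cv`, `grad_terms`, `mu_hat` are all illustrative assumptions, not the paper's implementation): the gradient estimator is centred at a fixed point estimate so that only minibatch gradients are needed per iteration, and the stored stochastic gradients are then reused for zero variance post-processing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (an assumption for illustration): x_i ~ N(mu, 1) with a flat
# prior on mu, so grad log p(x_i | mu) = x_i - mu and the posterior of mu
# is N(mean(x), 1/N).
N = 10_000
x = rng.normal(1.5, 1.0, size=N)

def grad_terms(mu, xs):
    """Per-observation gradients of the log-likelihood at mu."""
    return xs - mu

# Control-variate centre: a point estimate of the posterior mode (here the
# MLE) whose full-data gradient is computed once, an O(N) one-off cost.
mu_hat = x.mean()
full_grad_hat = grad_terms(mu_hat, x).sum()

def sgld_cv(n_iter=5_000, batch=100, eps=1e-4):
    """SGLD with a gradient estimator centred at mu_hat (SGLD-CV)."""
    mu = mu_hat
    samples = np.empty(n_iter)
    grads = np.empty(n_iter)
    for t in range(n_iter):
        idx = rng.integers(0, N, size=batch)
        # Unbiased estimate of grad log posterior; subtracting the gradients
        # evaluated at mu_hat shrinks its variance while mu stays near mu_hat.
        g = full_grad_hat + (N / batch) * (
            grad_terms(mu, x[idx]) - grad_terms(mu_hat, x[idx])
        ).sum()
        samples[t], grads[t] = mu, g
        # Langevin update: half a step along the gradient plus injected noise.
        mu = mu + 0.5 * eps * g + np.sqrt(eps) * rng.normal()
    return samples, grads

samples, grads = sgld_cv()

# Zero variance post-processing: the gradient of the log posterior has
# expectation zero at stationarity, so samples + a * grads estimates E[mu]
# for any a; choose the a that minimises the empirical variance. Here the
# exact gradient is replaced by the stored noisy SGMCMC estimate.
C = np.cov(samples, grads)
a = -C[0, 1] / C[1, 1]
zv_estimate = (samples + a * grads).mean()
print(samples.mean(), zv_estimate)
```

In this conjugate toy problem the centred estimator happens to remove the minibatch noise entirely, so the variance reduction is much more dramatic than one would see on a realistic model; the sketch is only meant to show where the O(N) work stops and the O(1) per-iteration work begins.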
