Statistically efficient thinning of a Markov chain sampler

27 October 2015

Abstract

It is common to subsample Markov chain samples to reduce the storage burden of the output. It is also well known that discarding $k-1$ out of every $k$ observations will not improve statistical efficiency. It is less frequently remarked that subsampling a Markov chain allows one to omit some of the computation beyond that needed to simply advance the chain. When this reduced computation is accounted for, thinning the Markov chain by subsampling it can improve statistical efficiency. Given an autocorrelation parameter $\rho$ and a cost ratio $\theta$, this paper shows how to compute the most efficient subsampling frequency $k$. The optimal $k$ grows rapidly as $\rho$ increases towards $1$. The resulting efficiency gain depends primarily on $\theta$, not $\rho$. Taking $k=1$ (no thinning) is optimal when $\rho\le0$. For $\rho>0$, taking $k=1$ is optimal if and only if $\theta \le (1-\rho)^2/(2\rho)$.
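The trade-off described in the abstract can be illustrated with a minimal sketch. The abstract does not state the underlying model, so the following is an assumption: the sampled quantity has lag-$\ell$ autocorrelation $\rho^{|\ell|}$, advancing the chain one step costs 1 unit, and evaluating the quantity of interest costs $\theta$ units. Under that assumption the cost-weighted asymptotic variance for thinning factor $k$ is proportional to $(k+\theta)(1+\rho^k)/(1-\rho^k)$, and comparing $k=1$ with $k=2$ reproduces the threshold $\theta \le (1-\rho)^2/(2\rho)$ quoted above. The function names and the brute-force search bound `k_max` are hypothetical and are not taken from the paper.

```python
def cost_weighted_variance(k, rho, theta):
    """Cost per retained sample times its variance inflation factor.

    Assumed model (not stated in the abstract): autocorrelation rho**lag,
    unit cost per chain step, cost theta per function evaluation.
    """
    rk = rho ** k
    return (k + theta) * (1.0 + rk) / (1.0 - rk)


def optimal_k(rho, theta, k_max=10_000):
    """Brute-force search for the thinning factor with the lowest cost-weighted variance."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if theta < 0.0:
        raise ValueError("theta must be non-negative")
    return min(range(1, k_max + 1),
               key=lambda k: cost_weighted_variance(k, rho, theta))


def efficiency_gain(rho, theta, k):
    """Efficiency of thinning by k relative to no thinning (k = 1)."""
    return (cost_weighted_variance(1, rho, theta)
            / cost_weighted_variance(k, rho, theta))


if __name__ == "__main__":
    rho, theta = 0.5, 1.0
    k = optimal_k(rho, theta)
    print(k, efficiency_gain(rho, theta, k))
```

With $\rho = 0.5$ and $\theta = 1$, the threshold $(1-\rho)^2/(2\rho) = 0.25$ is exceeded, and this sketch selects $k = 2$ with a relative efficiency of $1.2$. The exhaustive search is used only for simplicity; the paper itself may compute the optimum differently.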
