Statistically efficient thinning of a Markov chain sampler

Abstract

It is common to subsample Markov chain samples to reduce the storage burden of the output. It is also well known that discarding $k-1$ out of every $k$ observations will not improve statistical efficiency. It is less frequently remarked that subsampling a Markov chain allows one to omit some of the computation beyond that needed to simply advance the chain. When this reduced computation is accounted for, thinning the Markov chain by subsampling it can improve statistical efficiency. Given an autocorrelation parameter $\rho$ and a cost ratio $\theta$, this paper shows how to compute the most efficient subsampling frequency $k$. The optimal $k$ grows rapidly as $\rho$ increases towards $1$. The resulting efficiency gain depends primarily on $\theta$, not $\rho$. Taking $k=1$ (no thinning) is optimal when $\rho\le0$. For $\rho>0$ it is optimal if and only if $\theta \le (1-\rho)^2/(2\rho)$.
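
The trade-off described in the abstract can be illustrated with a small numerical sketch. It rests on assumptions that the abstract does not spell out: the lag-$j$ autocorrelation of the quantity of interest is taken to be $\rho^{j}$, advancing the chain one step costs 1 unit, and evaluating the quantity at a retained state costs $\theta$ units. Under those assumptions the cost per unit of information when keeping every $k$-th state is proportional to $(k+\theta)(1+\rho^k)/(1-\rho^k)$. The function names and the `k_max` search cap are hypothetical, chosen only for this illustration, and the brute-force search is not claimed to be the paper's own procedure.

```python
def cost_per_information(k, rho, theta):
    """Cost per unit of information when keeping every k-th state.

    Assumed model (not stated in the abstract): lag-j autocorrelation rho**j,
    unit cost per chain step, cost theta per evaluated (retained) state.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if theta < 0:
        raise ValueError("theta must be non-negative")
    if k < 1:
        raise ValueError("k must be a positive integer")
    r = rho ** k  # autocorrelation between consecutive retained states
    return (k + theta) * (1.0 + r) / (1.0 - r)


def relative_efficiency(k, rho, theta):
    """Efficiency of thinning by k relative to no thinning (k = 1)."""
    return cost_per_information(1, rho, theta) / cost_per_information(k, rho, theta)


def optimal_thinning(rho, theta, k_max=10_000):
    """Brute-force search for the best k in 1..k_max (k_max is an arbitrary cap)."""
    return min(range(1, k_max + 1),
               key=lambda k: cost_per_information(k, rho, theta))


def no_thinning_is_optimal(rho, theta):
    """Closed-form condition quoted in the abstract for k = 1 being optimal."""
    if rho <= 0:
        return True
    return theta <= (1.0 - rho) ** 2 / (2.0 * rho)
```

For example, with $\rho = 0.5$ the threshold $(1-\rho)^2/(2\rho)$ equals $0.25$, so under this model `optimal_thinning(0.5, 0.2)` returns 1, while `optimal_thinning(0.5, 1.0)` returns 2 with `relative_efficiency(2, 0.5, 1.0)` equal to 1.2.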
