Finite Sample Properties of Adaptive Markov Chains via Curvature
Adaptive Markov chains are an important class of Monte Carlo methods for sampling from probability distributions. The time evolution of an adaptive algorithm depends on its past samples, and so these algorithms are non-Markovian. Although there is previous work establishing conditions for their ergodicity, little is known theoretically about their finite sample properties. In this paper, using a variant of the discrete Ricci curvature for Markov kernels introduced by Ollivier, we establish concentration inequalities and finite sample bounds for a class of adaptive Markov chains. After establishing some general results, we work out two examples. In the first example, we give quantitative bounds for `multi-level' adaptive algorithms such as the equi-energy sampler, and we provide the first rigorous proofs that the finite sample properties obtained from an equi-energy sampler are superior to those obtained from related parallel tempering and MCMC samplers. We also discuss some qualitative differences between the equi-energy sampler, which often eventually satisfies a `positive curvature' condition, and parallel tempering, which does not. In the second example, we analyze a simple adaptive version of the usual random walk and show that adaptation improves the mixing time. Finally, we discuss the use of simulation methods for obtaining rigorous mixing bounds for adaptive algorithms.
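As a concrete illustration of the curvature notion the abstract refers to, the sketch below computes Ollivier's coarse Ricci curvature, kappa(x, y) = 1 - W1(P(x, .), P(y, .)) / d(x, y), exactly for a lazy random walk on a path. The chain, the path length, and the holding probabilities are illustrative choices for this note, not constructions taken from the paper.

```python
# Ollivier's coarse Ricci curvature for a lazy random walk on the path
# {0, ..., N-1} with metric d(i, j) = |i - j|.  On a subset of the real
# line, the L1-Wasserstein distance W1 between two distributions equals
# the L1 distance between their CDFs, so kappa can be computed exactly.
N = 10

def step_dist(x):
    """One-step distribution of the lazy walk: hold w.p. 1/2, otherwise
    attempt a move to a uniform neighbour; moves off the path hold."""
    p = [0.0] * N
    p[x] += 0.5
    for y in (x - 1, x + 1):
        if 0 <= y < N:
            p[y] += 0.25
        else:
            p[x] += 0.25  # reflect by holding at the boundary
    return p

def w1_on_path(p, q):
    """Exact W1 on {0, ..., N-1} with metric |i - j|: the sum of the
    absolute differences between the two CDFs."""
    total, fp, fq = 0.0, 0.0, 0.0
    for k in range(N):
        fp += p[k]
        fq += q[k]
        total += abs(fp - fq)
    return total

def curvature(x, y):
    return 1.0 - w1_on_path(step_dist(x), step_dist(y)) / abs(x - y)

# In the interior the walk looks like simple random walk on the integers,
# which has zero curvature; boundary holding makes kappa positive there.
print(curvature(4, 5))  # -> 0.0 (interior edge)
print(curvature(0, 1))  # -> 0.25 (boundary edge)
```

Positive curvature of this kind is exactly the kind of condition under which Ollivier-style arguments yield contraction in W1 and, from it, concentration bounds.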
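For readers unfamiliar with the algorithm in the first example, here is a heavily simplified sketch of an equi-energy sampler in the spirit of Kou, Zhou and Wong (2006). The double-well target, the temperature and energy ladders, the tuning constants, and the sequential (rather than concurrent) scheduling of the chains are all illustrative simplifications made for this note; this is not the construction analyzed in the paper.

```python
import math
import random

# Sketch of an equi-energy sampler: a ladder of Metropolis chains at
# temperatures TEMPS, where each chain may jump to a stored past state
# of the next-hotter chain drawn from the same energy ring.  For
# simplicity each chain here runs only after its hotter neighbour has
# finished; the real algorithm runs the chains concurrently.
random.seed(0)

def energy(x):                   # double-well energy; pi(x) ~ exp(-energy(x)/T)
    return (x * x - 1.0) ** 2

TEMPS = [4.0, 2.0, 1.0]          # hottest chain first
RINGS = [0.0, 1.0, 4.0]          # energy ladder H_1 < H_2 < ...
P_EE = 0.1                       # probability of attempting an equi-energy jump
N_STEPS = 2000

def ring(x):
    """Index of the energy ring [H_j, H_{j+1}) containing x."""
    e, k = energy(x), 0
    for j, h in enumerate(RINGS):
        if e >= h:
            k = j
    return k

def accept(log_ratio):
    return log_ratio >= 0.0 or random.random() < math.exp(log_ratio)

def run_chain(temp, hotter_temp, history):
    """Metropolis chain at `temp` with equi-energy jumps into the ring-
    indexed `history` of the chain at `hotter_temp` (None for the top)."""
    x, out = 0.0, []
    for _ in range(N_STEPS):
        if history and random.random() < P_EE:
            candidates = history[ring(x)]    # past hotter states, same ring
            if candidates:
                y = random.choice(candidates)
                log_ratio = (energy(x) - energy(y)) * (1.0 / temp - 1.0 / hotter_temp)
                if accept(log_ratio):
                    x = y
        else:
            y = x + random.gauss(0.0, 1.0)   # ordinary random-walk move
            if accept((energy(x) - energy(y)) / temp):
                x = y
        out.append(x)
    return out

# Run the ladder from hottest to coldest, binning each chain's states by ring.
hist, samples, prev_temp = None, None, None
for t in TEMPS:
    samples = run_chain(t, prev_temp, hist)
    hist = [[] for _ in RINGS]
    for x in samples:
        hist[ring(x)].append(x)
    prev_temp = t
```

The jump acceptance probability min(1, exp((E(x) - E(y))(1/T_k - 1/T_{k+1}))) is the standard Metropolis ratio for targets proportional to exp(-E/T_k) and exp(-E/T_{k+1}); the equi-energy restriction to a common ring keeps such jumps from being routinely rejected, which is what lets the cold chain borrow the hot chain's mobility between modes.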