We study stochastic approximation algorithms with Markovian noise and constant step-size $\alpha$. We develop a method based on infinitesimal generator comparisons to study the bias of the algorithm, which is the expected difference between $\theta_n$ -- the value at iteration $n$ -- and $\theta^*$ -- the unique equilibrium of the corresponding ODE. We show that, under some smoothness conditions, this bias is of order $O(\alpha)$. Furthermore, we show that the time-averaged bias is equal to $\alpha V + O(\alpha^2)$, where $V$ is a constant characterized by a Lyapunov equation, showing that $\mathbb{E}[\bar{\theta}_n] \approx \theta^* + V\alpha + O(\alpha^2)$, where $\bar{\theta}_n$ is the Polyak-Ruppert average. We also show that $\bar{\theta}_n$ converges with high probability around $\theta^* + \alpha V$. We illustrate how to combine this with Richardson-Romberg extrapolation to derive an iterative scheme with a bias of order $O(\alpha^2)$.
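As a rough illustration of the extrapolation idea (not the paper's algorithm or model), the sketch below runs a toy constant step-size iteration driven by a two-state Markov chain, once with step size $\alpha$ and once with $2\alpha$, and combines the two Polyak-Ruppert averages as $2\bar{\theta}(\alpha) - \bar{\theta}(2\alpha)$ to cancel the first-order bias term. The nonlinear drift, chain parameters, and constants are all hypothetical choices for the demo.

```python
import random

def run_sa(alpha, n_iters, seed):
    """Polyak-Ruppert average of a toy constant step-size SA iteration.

    Toy model (illustrative, not from the paper):
        theta_{n+1} = theta_n + alpha * (b(X_n) - theta_n**3),
    where X_n is a sticky two-state Markov chain with stationary
    distribution (1/2, 1/2) and b = (0.5, 1.5), so the corresponding
    ODE has unique equilibrium theta* = 1. The correlation in X_n
    induces an O(alpha) bias in the averaged iterate.
    """
    rng = random.Random(seed)
    b = (0.5, 1.5)
    p_switch = 0.1          # small switching probability -> correlated noise
    x, theta = 0, 1.0
    burn_in = n_iters // 2  # discard the transient before averaging
    acc = 0.0
    for n in range(n_iters):
        theta += alpha * (b[x] - theta ** 3)
        if n >= burn_in:
            acc += theta
        if rng.random() < p_switch:
            x = 1 - x
    return acc / (n_iters - burn_in)

# Richardson-Romberg extrapolation: run with step sizes alpha and 2*alpha,
# then form 2*bar(alpha) - bar(2*alpha) to cancel the O(alpha) bias term.
bar_a = run_sa(alpha=0.05, n_iters=400_000, seed=1)
bar_2a = run_sa(alpha=0.10, n_iters=400_000, seed=2)
theta_rr = 2.0 * bar_a - bar_2a
```

The combination works because, by the result summarized above, each average sits near $\theta^* + \alpha V$ up to $O(\alpha^2)$ terms, so the $\alpha V$ and $2\alpha V$ contributions cancel in the extrapolated estimate.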