In this work, we study variance reduction applied to adaptive stochastic mirror descent algorithms for nonsmooth nonconvex finite-sum optimization problems. We propose a simple yet general adaptive mirror descent algorithm with variance reduction, named SVRAMD, and provide its convergence analysis in different settings. We prove that variance reduction reduces the stochastic first-order oracle (SFO) complexity of most adaptive mirror descent algorithms and accelerates their convergence. In particular, our general theory implies that variance reduction can be applied to algorithms that use time-varying step sizes and to self-adaptive algorithms such as AdaGrad and RMSProp. Moreover, the convergence rates of SVRAMD recover the best existing rates of non-adaptive variance-reduced mirror descent algorithms. We validate our claims with experiments in deep learning.
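To make the idea concrete, the following is a minimal sketch of how an SVRG-style variance-reduced gradient can be combined with an AdaGrad-style diagonal proximal function and a nonsmooth l1 term. It is an illustration under stated assumptions, not the SVRAMD algorithm as specified in the paper: the toy objective is a convex least-squares finite sum chosen for simplicity, and the function names, step size, epoch length, and regularization weight are all hypothetical choices.

```python
import random

def soft_threshold(v, tau):
    """Proximal map of tau * |.| (handles the nonsmooth l1 term)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0

def dot(a, x):
    return sum(aj * xj for aj, xj in zip(a, x))

def grad_i(x, a, b):
    """Gradient of the i-th component 0.5 * (a.x - b)^2."""
    r = dot(a, x) - b
    return [r * aj for aj in a]

def full_grad(x, A, y):
    """Average of all component gradients (one pass over the data)."""
    n = len(A)
    g = [0.0] * len(x)
    for a, b in zip(A, y):
        g = [gj + gij / n for gj, gij in zip(g, grad_i(x, a, b))]
    return g

def objective(x, A, y, lam):
    """Finite-sum least squares plus an l1 penalty."""
    loss = sum(0.5 * (dot(a, x) - b) ** 2 for a, b in zip(A, y)) / len(A)
    return loss + lam * sum(abs(xj) for xj in x)

def vr_adagrad_sketch(A, y, lam=0.01, lr=0.5, epochs=20, eps=1e-8, seed=0):
    """Hypothetical sketch: SVRG gradient estimate + AdaGrad-style scaling."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    h = [0.0] * d  # running sum of squared gradient estimates
    for _ in range(epochs):
        snapshot = list(x)
        mu = full_grad(snapshot, A, y)  # full gradient at the snapshot
        for _ in range(n):  # assumed inner-loop length: one pass
            i = rng.randrange(n)
            gx = grad_i(x, A[i], y[i])
            gs = grad_i(snapshot, A[i], y[i])
            # Variance-reduced estimate: unbiased for the full gradient at x.
            v = [p - q + m for p, q, m in zip(gx, gs, mu)]
            h = [hj + vj * vj for hj, vj in zip(h, v)]
            for j in range(d):
                step = lr / (h[j] ** 0.5 + eps)  # per-coordinate step size
                # Mirror (proximal) step under the diagonal quadratic
                # proximal function, solved in closed form for l1.
                x[j] = soft_threshold(x[j] - step * v[j], step * lam)
    return x

# Toy usage on synthetic data (all values here are illustrative).
_rng = random.Random(1)
x_true = [1.0, -2.0, 0.0]
A = [[_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(50)]
y = [dot(a, x_true) for a in A]
f0 = objective([0.0, 0.0, 0.0], A, y, 0.01)
x_hat = vr_adagrad_sketch(A, y)
f1 = objective(x_hat, A, y, 0.01)
print(f0, f1, x_hat)
```

The snapshot gradient `mu` costs one full pass per epoch, while each inner step evaluates only two component gradients. Replacing the running sum `h` with an exponential moving average would give an RMSProp-style variant of the same sketch.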