Stochastic Particle-Optimization Sampling and the Non-Asymptotic
Convergence Theory
Particle-optimization-based sampling (POS) is a recently developed, effective sampling technique that interactively updates a set of particles. A representative algorithm is Stein variational gradient descent (SVGD). Despite its significant empirical success, the {\em non-asymptotic} convergence behavior of SVGD remains largely unknown. We prove that, under certain conditions, SVGD suffers from a theoretical pitfall in which particles tend to collapse. As a remedy, we generalize POS to a stochastic setting by injecting random noise into the particle updates, yielding what we term stochastic particle-optimization sampling (SPOS). Notably, for the first time, we develop a {\em non-asymptotic convergence theory} for the SPOS framework (related to SVGD), characterizing algorithm convergence in terms of the 1-Wasserstein distance w.r.t.\! the number of particles and iterations, under both convex- and nonconvex-energy-function settings. Somewhat surprisingly, with the same number of updates (not too large) for each particle, our theory suggests that adopting more particles does not necessarily lead to a better approximation of the target distribution, due to the limited computational budget and numerical errors. This phenomenon is also observed in SVGD and verified via a synthetic experiment. Our development is based on the analysis of nonlinear partial differential equations. Experimental results verify our theory and demonstrate the effectiveness of the proposed framework.
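For concreteness, one SPOS-style iteration for particle $\theta_i$ can be sketched as the standard SVGD transport direction plus injected Gaussian noise. This is a schematic form under a generic kernel $K$ (e.g.\! an RBF kernel); the step size $\epsilon_k$ and inverse temperature $\beta$ are generic symbols, and the precise drift analyzed in the paper may differ:
\[
\theta_i^{(k+1)} = \theta_i^{(k)} + \frac{\epsilon_k}{M}\sum_{j=1}^{M}\Big[K\big(\theta_j^{(k)},\theta_i^{(k)}\big)\,\nabla_{\theta_j^{(k)}}\log p\big(\theta_j^{(k)}\big) + \nabla_{\theta_j^{(k)}} K\big(\theta_j^{(k)},\theta_i^{(k)}\big)\Big] + \sqrt{2\beta^{-1}\epsilon_k}\,\xi_i^{(k)}, \qquad \xi_i^{(k)}\sim\mathcal{N}(0,I),
\]
where $M$ is the number of particles and $p$ is the target distribution. Dropping the noise term $\xi_i^{(k)}$ recovers deterministic SVGD; the injected noise is the stochasticity that remedies the particle-collapse pitfall and is the object of the non-asymptotic 1-Wasserstein analysis.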