Stochastic Particle-Optimization Sampling and the Non-Asymptotic
Convergence Theory
Particle-optimization-based sampling (POS) is a recently developed, effective sampling technique that interactively updates a set of particles. A representative algorithm is Stein variational gradient descent (SVGD). Despite its significant empirical success, the {\em non-asymptotic} convergence behavior of SVGD remains largely unknown. We prove that, under certain conditions, SVGD suffers from a theoretical pitfall in which particles tend to collapse. As a remedy, we generalize POS to a stochastic setting by injecting random noise into the particle updates, yielding what we term stochastic particle-optimization sampling (SPOS). Notably, for the first time, we develop a {\em non-asymptotic convergence theory} for the SPOS framework (related to SVGD), characterizing algorithm convergence in terms of the 1-Wasserstein distance w.r.t.\! the number of particles and iterations, under both convex- and nonconvex-energy-function settings. Somewhat surprisingly, with the same number of updates (not too large) for each particle, our theory suggests that adopting more particles does not necessarily lead to a better approximation of the target distribution, due to the limited computational budget and numerical errors. This phenomenon is also observed in SVGD and verified via a synthetic experiment. Our development is based on the analysis of nonlinear partial differential equations. Experimental results verify our theory and demonstrate the effectiveness of the proposed framework.
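For concreteness, one SPOS-style iteration for particle $\theta_i$ can be sketched as the standard SVGD transport direction plus injected Gaussian noise. This is a schematic form under a generic kernel $K$ (e.g.\! an RBF kernel); the step size $\epsilon_k$ and inverse temperature $\beta$ are generic symbols, and the precise drift analyzed in the paper may differ:
\[
\theta_i^{(k+1)} = \theta_i^{(k)} + \frac{\epsilon_k}{M}\sum_{j=1}^{M}\Big[K\big(\theta_j^{(k)},\theta_i^{(k)}\big)\,\nabla_{\theta_j^{(k)}}\log p\big(\theta_j^{(k)}\big) + \nabla_{\theta_j^{(k)}} K\big(\theta_j^{(k)},\theta_i^{(k)}\big)\Big] + \sqrt{2\beta^{-1}\epsilon_k}\,\xi_i^{(k)}, \qquad \xi_i^{(k)}\sim\mathcal{N}(0,I),
\]
where $M$ is the number of particles and $p$ is the target distribution. Dropping the noise term $\xi_i^{(k)}$ recovers deterministic SVGD; the injected noise is the stochasticity that remedies the particle-collapse pitfall and is the object of the non-asymptotic 1-Wasserstein analysis.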