Improved Stein Variational Gradient Descent with Importance Weights

Abstract

Stein Variational Gradient Descent (SVGD) is a popular sampling algorithm used in various machine learning tasks. It is well known that SVGD arises from a discretization of the kernelized gradient flow of the Kullback-Leibler divergence $D_{KL}(\cdot\mid\pi)$, where $\pi$ is the target distribution. In this work, we propose to enhance SVGD via the introduction of importance weights, which leads to a new method for which we coin the name $\beta$-SVGD. In the continuous-time, infinite-particle regime, the time for this flow to converge to the equilibrium distribution $\pi$, quantified by the Stein Fisher information, depends only very weakly on $\rho_0$ and $\pi$. This is in sharp contrast to the kernelized gradient flow of the Kullback-Leibler divergence, whose time complexity depends on $D_{KL}(\rho_0\mid\pi)$. Under certain assumptions, we provide a descent lemma for the population-limit $\beta$-SVGD, which recovers the descent lemma for population-limit SVGD as $\beta\to 0$. We also illustrate the advantages of $\beta$-SVGD over SVGD through experiments.

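To make the underlying update concrete, the sketch below implements one step of standard SVGD in NumPy with an RBF kernel, and exposes an optional per-particle weight argument to mark where importance weights would enter; the particular $\beta$-dependent weighting that defines $\beta$-SVGD is specified in the paper and is not reproduced here. The kernel choice, the bandwidth `h`, and the names `rbf_kernel` and `svgd_step` are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def rbf_kernel(X, h=1.0):
    """Pairwise RBF kernel K[j, i] = exp(-||x_j - x_i||^2 / (2 h^2))
    and its gradient with respect to x_j."""
    diffs = X[:, None, :] - X[None, :, :]        # (n, n, d): x_j - x_i
    sq_dists = np.sum(diffs ** 2, axis=-1)       # (n, n)
    K = np.exp(-sq_dists / (2 * h ** 2))         # (n, n)
    grad_K = -diffs * K[:, :, None] / h ** 2     # (n, n, d): grad_{x_j} k(x_j, x_i)
    return K, grad_K

def svgd_step(X, grad_log_pi, step_size=1e-2, weights=None, h=1.0):
    """One SVGD update for particles X (n, d).

    `weights` are hypothetical per-particle importance weights; uniform
    weights (the default) recover the standard SVGD direction
    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log pi(x_j) + grad_{x_j} k(x_j, x_i) ].
    """
    n = X.shape[0]
    if weights is None:
        weights = np.full(n, 1.0 / n)
    scores = grad_log_pi(X)                      # (n, d), row j is grad log pi(x_j)
    K, grad_K = rbf_kernel(X, h)
    phi = (np.einsum('j,ji,jd->id', weights, K, scores)
           + np.einsum('j,jid->id', weights, grad_K))
    return X + step_size * phi

# Usage example (assumed target): a standard 2-D Gaussian, grad log pi(x) = -x.
rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=1.0, size=(100, 2))
for _ in range(500):
    X = svgd_step(X, lambda x: -x, step_size=1e-1)
```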