Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent

Krishnakumar Balasubramanian
Sayan Banerjee
Promit Ghosal
Abstract

We provide finite-particle convergence rates for the Stein Variational Gradient Descent (SVGD) algorithm in the Kernelized Stein Discrepancy ($\mathsf{KSD}$) and Wasserstein-2 metrics. Our key insight is that the time derivative of the relative entropy between the joint density of $N$ particle locations and the $N$-fold product target measure, starting from a regular initial distribution, splits into a dominant `negative part' proportional to $N$ times the expected $\mathsf{KSD}^2$ and a smaller `positive part'. This observation leads to $\mathsf{KSD}$ rates of order $1/\sqrt{N}$, in both continuous and discrete time, providing a near-optimal (in the sense of matching the corresponding i.i.d. rates) double exponential improvement over the recent result by Shi and Mackey (2024). Under mild assumptions on the kernel and potential, these bounds also grow polynomially in the dimension $d$. By adding a bilinear component to the kernel, the above approach is used to further obtain Wasserstein-2 convergence in continuous time. For the case of `bilinear + Matérn' kernels, we derive Wasserstein-2 rates that exhibit a curse of dimensionality similar to the i.i.d. setting. We also obtain marginal convergence and long-time propagation of chaos results for the time-averaged particle laws.
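For readers unfamiliar with the algorithm analyzed above, the following is a minimal sketch of the standard SVGD particle update with an RBF kernel, transporting $N$ particles toward a target density $p$ via $\phi(x_i) = \frac{1}{N}\sum_j [k(x_j, x_i)\nabla\log p(x_j) + \nabla_{x_j} k(x_j, x_i)]$. All function names, the fixed bandwidth, and the Gaussian target are illustrative choices for this sketch, not details taken from the paper.

```python
import numpy as np

def rbf_kernel(X, h):
    """RBF kernel matrix K[j, i] = k(x_j, x_i) and its gradient w.r.t. x_j."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2 * h ** 2))
    # d/dx_j exp(-|x_i - x_j|^2 / (2 h^2)) = (x_i - x_j) / h^2 * K[j, i]
    gradK = (X[None, :, :] - X[:, None, :]) / h ** 2 * K[:, :, None]
    return K, gradK

def svgd_step(X, grad_log_p, h=1.0, step=0.1):
    """One SVGD update: attraction toward high density plus kernel repulsion."""
    N = X.shape[0]
    K, gradK = rbf_kernel(X, h)
    phi = (K @ grad_log_p(X) + gradK.sum(axis=0)) / N
    return X + step * phi

# Illustrative target: standard Gaussian, so grad log p(x) = -x.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=1.0, size=(50, 2))  # particles start far from target
for _ in range(500):
    X = svgd_step(X, lambda x: -x, h=1.0, step=0.1)
# Particle mean drifts toward the target mean (0, 0); the repulsion term
# keeps the cloud spread out rather than collapsing to the mode.
```

The repulsion term `gradK.sum(axis=0)` is what distinguishes SVGD from plain parallel gradient ascent on $\log p$: without it, all particles would converge to the same mode instead of approximating the target measure.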
