
Limit Distribution Theory for Smooth Wasserstein Distance with Applications to Generative Modeling

Abstract

The 1-Wasserstein distance ($\mathsf{W}_1$) is a popular proximity measure between probability distributions. Its metric structure, robustness to support mismatch, and rich geometry fueled its wide adoption for machine learning tasks. Such tasks inherently rely on approximating distributions from data. This surfaces a central issue: empirical approximation under Wasserstein distances suffers from the curse of dimensionality, converging at rate $n^{-1/d}$, where $n$ is the sample size and $d$ is the data dimension; this rate deteriorates drastically in high dimensions. To circumvent this impasse, we adopt the framework of the Gaussian-smoothed Wasserstein distance $\mathsf{W}_1^{(\sigma)}$, in which both probability measures are convolved with an isotropic Gaussian distribution with parameter $\sigma > 0$. In remarkable contrast to classic $\mathsf{W}_1$, the empirical convergence rate under $\mathsf{W}_1^{(\sigma)}$ is $n^{-1/2}$ in all dimensions. Inspired by this fact, the present paper conducts an in-depth study of the statistical properties of the smooth Wasserstein distance. We derive the limit distribution of $\sqrt{n}\,\mathsf{W}_1^{(\sigma)}(P_n, P)$ for all $d$, where $P_n$ is the empirical measure of $n$ independent observations from $P$. In arbitrary dimension, the limit is characterized as the supremum of a tight Gaussian process indexed by 1-Lipschitz functions convolved with a Gaussian density. Building on this result, we derive concentration inequalities, establish bootstrap consistency, and explore generative modeling with $\mathsf{W}_1^{(\sigma)}$ under the minimum distance estimation framework. For the latter, we establish measurability, almost sure convergence, and limit distributions for optimal generative models and their corresponding smooth Wasserstein error. These results promote the smooth Wasserstein distance as a powerful tool for statistical inference in high dimensions.
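The duality behind the limit characterization admits a one-line sketch. Assuming the standard Kantorovich-Rubinstein dual form of $\mathsf{W}_1$ and writing $\varphi_\sigma$ for the $\mathcal{N}(0,\sigma^2 \mathrm{I}_d)$ density (notation assumed here, not fixed by the abstract), the smoothing can be moved from the measures onto the witness functions:

```latex
% Kantorovich-Rubinstein duality: convolving the measures with N(0, sigma^2 I_d)
% is equivalent to convolving the 1-Lipschitz witness functions with its density.
\[
  \mathsf{W}_1^{(\sigma)}(P_n, P)
    = \sup_{\|f\|_{\mathrm{Lip}} \le 1}
        \int f \, d\bigl((P_n - P) * \mathcal{N}_\sigma\bigr)
    = \sup_{\|f\|_{\mathrm{Lip}} \le 1}
        \int (f * \varphi_\sigma) \, d(P_n - P),
\]
% so sqrt(n) * W_1^(sigma)(P_n, P) is the supremum of an empirical process
% indexed by {f * phi_sigma : ||f||_Lip <= 1}, consistent with the
% Gaussian-process limit described in the abstract.
```

To see the $n^{-1/2}$ behavior numerically, the following is a minimal Python sketch, not code from the paper. It assumes NumPy and SciPy, uses a fresh independent sample as a stand-in for the population measure $P$ (so it estimates $\mathsf{W}_1^{(\sigma)}$ between two smoothed empirical measures, which decays at the same rate), and the helper name `smooth_w1` is hypothetical:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def smooth_w1(x, y, sigma, rng):
    """Estimate W1^(sigma) between the empirical measures of x and y.

    Samples from mu * N(0, sigma^2 I_d) are exactly mu-samples plus Gaussian
    noise; W1 between two equal-size uniform empirical measures reduces to
    an optimal assignment problem (Birkhoff's theorem).
    """
    xs = x + sigma * rng.standard_normal(x.shape)
    ys = y + sigma * rng.standard_normal(y.shape)
    cost = cdist(xs, ys)                      # pairwise Euclidean costs
    rows, cols = linear_sum_assignment(cost)  # optimal coupling
    return cost[rows, cols].mean()

rng = np.random.default_rng(0)
d, sigma = 10, 1.0
for n in (100, 400, 1600):
    x = rng.standard_normal((n, d))  # P_n: n draws from P = N(0, I_d)
    y = rng.standard_normal((n, d))  # fresh draws stand in for P itself
    print(n, smooth_w1(x, y, sigma, rng))  # expect decay roughly like n^{-1/2}
```

The assignment step is exact for equal-size uniform empirical measures; a large-scale implementation would swap it for an entropic or sliced optimal transport solver.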
