Gromov-Wasserstein Distances: Entropic Regularization, Duality, and Sample Complexity

Annals of Statistics (Ann. Stat.), 2022
Abstract

The Gromov-Wasserstein (GW) distance quantifies dissimilarity between metric measure spaces and provides a meaningful figure of merit for applications involving heterogeneous data. While computational aspects of the GW distance have been widely studied, a strong duality theory and fundamental statistical questions concerning empirical convergence rates remained obscure. This work closes these gaps for the $(2,2)$-GW distance (namely, with quadratic cost) over Euclidean spaces of different dimensions $d_x$ and $d_y$. We consider both the standard GW and the entropic GW (EGW) distances, derive their dual forms, and use them to analyze expected empirical convergence rates. The resulting rates are $n^{-2/\max\{d_x,d_y,4\}}$ (up to a log factor when $\max\{d_x,d_y\}=4$) and $n^{-1/2}$ for the two-sample GW and EGW problems, respectively, which matches the corresponding rates for standard and entropic optimal transport distances. We also study stability of EGW in the entropic regularization parameter and establish approximation and continuity results for the cost and optimal couplings. Lastly, the duality is leveraged to shed new light on the open problem of the one-dimensional GW distance between uniform distributions on $n$ points, illuminating why the identity and anti-identity permutations may not be optimal. Our results serve as a first step towards a comprehensive statistical theory as well as computational advancements for GW distances, based on the discovered dual formulation.
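To make the stated rates concrete, the following minimal sketch evaluates the exponents quoted in the abstract for given dimensions. The function names `gw_rate_exponent` and `egw_rate_exponent` are hypothetical and introduced only for illustration; constants are omitted, and the log factor is reported as a flag because the abstract does not give its exact form.

```python
def gw_rate_exponent(dx: int, dy: int) -> tuple[float, bool]:
    """Exponent a in the two-sample GW rate n^(-a), per the abstract.

    Returns (a, has_log_factor), where a = 2 / max(dx, dy, 4) and the flag
    is True exactly when max(dx, dy) == 4. Hypothetical helper: constants
    and the precise form of the log factor are not modeled.
    """
    exponent = 2.0 / max(dx, dy, 4)
    has_log_factor = max(dx, dy) == 4
    return exponent, has_log_factor


def egw_rate_exponent() -> float:
    """Exponent in the two-sample EGW rate n^(-1/2), independent of dimension."""
    return 0.5


if __name__ == "__main__":
    for dx, dy in [(2, 3), (4, 2), (5, 8)]:
        a, log_flag = gw_rate_exponent(dx, dy)
        note = " (up to a log factor)" if log_flag else ""
        print(f"dx={dx}, dy={dy}: GW rate n^(-{a}){note}, EGW rate n^(-{egw_rate_exponent()})")
```

For dimensions at most 4 the GW exponent equals the parametric value 1/2 (with the log factor appearing at exactly 4), while for larger dimensions it degrades with $\max\{d_x,d_y\}$; the EGW exponent stays at 1/2 throughout.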