Random Matrix-Improved Estimation of the Wasserstein Distance between two Centered Gaussian Distributions

Abstract

This article proposes a method to consistently estimate functionals $\frac1p\sum_{i=1}^p f(\lambda_i(C_1C_2))$ of the eigenvalues of the product of two covariance matrices $C_1,C_2\in\mathbb{R}^{p\times p}$ based on the empirical estimates $\lambda_i(\hat C_1\hat C_2)$ ($\hat C_a=\frac1{n_a}\sum_{i=1}^{n_a} x_i^{(a)}x_i^{(a){\sf T}}$), when the size $p$ and number $n_a$ of the (zero mean) samples $x_i^{(a)}$ are similar. As a corollary, a consistent estimate of the Wasserstein distance (related to the case $f(t)=\sqrt{t}$) between centered Gaussian distributions is derived. The new estimate is shown to largely outperform the classical sample covariance-based `plug-in' estimator. Based on this finding, a practical application to covariance estimation is then devised which demonstrates potentially significant performance gains with respect to state-of-the-art alternatives.
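For orientation, the classical `plug-in' baseline mentioned above can be sketched in a few lines. This is not the improved estimator proposed in the article. It only illustrates why $f(t)=\sqrt{t}$ is the relevant case, using the standard closed form for centered Gaussians, $W_2^2 = \operatorname{tr}C_1+\operatorname{tr}C_2-2\sum_{i=1}^p\sqrt{\lambda_i(C_1C_2)}$, stated here without the normalization the article may use. The function names and the convention that samples are stored as columns are assumptions made for this sketch.

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance of zero-mean data; X has shape (p, n), one sample per column."""
    n = X.shape[1]
    return X @ X.T / n

def eig_functional(C1, C2, f):
    """Return (1/p) * sum_i f(lambda_i(C1 C2)) for symmetric PSD C1, C2."""
    # Eigenvalues of a product of PSD matrices are real and nonnegative in
    # exact arithmetic; drop numerical imaginary parts and tiny negatives.
    lam = np.linalg.eigvals(C1 @ C2).real
    lam = np.clip(lam, 0.0, None)
    return float(np.mean(f(lam)))

def wasserstein2_squared(C1, C2):
    """Squared 2-Wasserstein distance between N(0, C1) and N(0, C2) (unnormalized)."""
    p = C1.shape[0]
    return float(np.trace(C1) + np.trace(C2)
                 - 2.0 * p * eig_functional(C1, C2, np.sqrt))

def plugin_wasserstein2_squared(X1, X2):
    """Plug-in estimate: substitute sample covariances for C1 and C2."""
    return wasserstein2_squared(sample_covariance(X1), sample_covariance(X2))
```

Calling `plugin_wasserstein2_squared(X1, X2)` with two `(p, n_a)` arrays of zero-mean samples gives the baseline estimate. When $p$ is comparable to $n_a$, the eigenvalues of $\hat C_1\hat C_2$ spread away from those of $C_1C_2$, which is the regime in which the article reports the plug-in estimator being outperformed.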
