From Two Sample Testing to Singular Gaussian Discrimination

7 May 2025

Abstract

We establish that testing for the equality of two probability measures on a general separable and compact metric space is equivalent to testing for the singularity between two corresponding Gaussian measures on a suitable Reproducing Kernel Hilbert Space. The corresponding Gaussians are defined via the notion of kernel mean and covariance embedding of a probability measure. Discerning two singular Gaussians is fundamentally simpler from an information-theoretic perspective than non-parametric two-sample testing, particularly in high-dimensional settings. Our proof leverages the Feldman-Hajek criterion for singularity/equivalence of Gaussians on Hilbert spaces, and shows that discrepancies between distributions are heavily magnified through their corresponding Gaussian embeddings: at a population level, distinct probability measures lead to essentially separated Gaussian embeddings. This appears to be a new instance of the blessing of dimensionality that can be harnessed for the design of efficient inference tools in great generality.
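For orientation, the objects named in the abstract can be written out as follows. This is a sketch using the standard textbook definitions, not necessarily the paper's exact construction: the symbols $k$, $\mathcal{H}$, $m_P$, $S_P$ and the uncentered-versus-centered choice for the covariance are assumptions introduced here for illustration.

```latex
% Assumed notation: k is a continuous positive-definite kernel on the
% compact metric space \mathcal{X}, with RKHS \mathcal{H}.
% Kernel mean and (centered) covariance embedding of a probability measure P:
\[
  m_P = \int_{\mathcal{X}} k(x,\cdot)\, dP(x) \in \mathcal{H},
  \qquad
  S_P = \int_{\mathcal{X}} \bigl(k(x,\cdot) - m_P\bigr) \otimes \bigl(k(x,\cdot) - m_P\bigr)\, dP(x).
\]
% Gaussian embedding of P on \mathcal{H}:
\[
  P \;\longmapsto\; \mathcal{N}(m_P, S_P).
\]
% Feldman-Hajek dichotomy: two Gaussian measures \mathcal{N}(m_1, S_1) and
% \mathcal{N}(m_2, S_2) on a separable Hilbert space are either equivalent
% or mutually singular. They are equivalent if and only if
\begin{enumerate}
  \item $\operatorname{range}(S_1^{1/2}) = \operatorname{range}(S_2^{1/2})$,
  \item $m_1 - m_2 \in \operatorname{range}(S_1^{1/2})$,
  \item $\bigl(S_1^{-1/2} S_2^{1/2}\bigr)\bigl(S_1^{-1/2} S_2^{1/2}\bigr)^{*} - I$
        is a Hilbert-Schmidt operator on $\overline{\operatorname{range}(S_1^{1/2})}$.
\end{enumerate}
% Claim described in the abstract, in this notation:
\[
  P \neq Q \;\Longrightarrow\; \mathcal{N}(m_P, S_P) \perp \mathcal{N}(m_Q, S_Q).
\]
```

Read this way, the abstract's statement is that a failure of any one of the three conditions already forces mutual singularity, so the two-sample problem $P = Q$ versus $P \neq Q$ becomes a problem of telling apart Gaussians that are either identical or singular.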


@article{santoro2025_2505.04613,
  title={From Two Sample Testing to Singular Gaussian Discrimination},
  author={Leonardo V. Santoro and Kartik G. Waghmare and Victor M. Panaretos},
  journal={arXiv preprint arXiv:2505.04613},
  year={2025}
}
