v1v2v3v4 (latest)

Scalable Learning from Probability Measures with Mean Measure Quantization

7 February 2025

Erell Gachon

Jérémie Bigot

Elsa Cazelles

ArXiv (abs)PDF HTML Github

Main:12 Pages

7 Figures

Bibliography:3 Pages

Appendix:11 Pages

Abstract

We consider statistical learning problems in which data are observed as a set of probability measures. Optimal transport (OT) is a popular tool to compare and manipulate such objects, but its computational cost becomes prohibitive when the measures have large support. We study a quantization-based approach in which all input measures are approximated by $K$ -point discrete measures sharing a common support. We establish consistency of the resulting quantized measures. We further derive convergence guarantees for several OT-based downstream tasks computed from the quantized measures. Numerical experiments on synthetic and real datasets demonstrate that the proposed approach achieves performance comparable to individual quantization while substantially reducing runtime.

View on arXiv

Comments on this paper