Universal consistency of Wasserstein $k$-NN classifier: Negative and Positive Results

The Wasserstein distance provides a notion of dissimilarity between probability measures, which has recent applications in learning from structured data of varying size, such as images and text documents. In this work, we study the $k$-nearest neighbor ($k$-NN) classifier of probability measures under the Wasserstein distance. We show that the $k$-NN classifier is not universally consistent on the space of measures supported in $(0,1)$. As any Euclidean ball contains a copy of $(0,1)$, one should not expect to obtain universal consistency without some restriction on the base metric space or on the Wasserstein space itself. To this end, via the notion of $\sigma$-finite metric dimension, we show that the $k$-NN classifier is universally consistent on spaces of measures supported in a $\sigma$-uniformly discrete set. In addition, by studying the geodesic structures of the Wasserstein spaces for $p=1$ and $p=2$, we show that the $k$-NN classifier is universally consistent on the space of measures supported on a finite set, the space of Gaussian measures, and the space of measures with densities expressed as finite wavelet series.