Concentration of the missing mass in metric spaces
Abstract
We study the estimation of the probability of observing data farther than a specified distance from a given i.i.d. sample in a metric space. This problem extends the classical problem of estimating the missing mass in discrete spaces. We show that estimation is difficult in general and identify conditions on the distribution under which the Good-Turing estimator and the conditional missing mass concentrate on their expectations. Applications to supervised learning are sketched.
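
As a rough illustration of the quantity being estimated, the sketch below computes one natural metric-space analogue of the Good-Turing estimator: the fraction of sample points that have no other sample point within distance r. The abstract does not state the estimator's exact form, so this definition is an assumption, as are the names `good_turing_metric`, `dist`, and `r`. Under the discrete metric with r < 1 it reduces to the classical Good-Turing estimate, the number of values seen exactly once divided by the sample size.

```python
from collections.abc import Callable, Sequence
from typing import TypeVar

T = TypeVar("T")


def good_turing_metric(
    sample: Sequence[T],
    dist: Callable[[T, T], float],
    r: float,
) -> float:
    """Fraction of sample points farther than r from every other sample point.

    Assumed form of a metric Good-Turing estimate (not taken from the paper):
    a leave-one-out count of r-isolated points, divided by the sample size.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    isolated = 0
    for i, x in enumerate(sample):
        # x is isolated if no *other* sample point lies within distance r.
        if all(dist(x, y) > r for j, y in enumerate(sample) if j != i):
            isolated += 1
    return isolated / n


# Real line with the absolute-value metric: 1.0 and 2.5 are isolated at r = 0.5.
line_estimate = good_turing_metric(
    [0.0, 0.1, 1.0, 2.5], lambda a, b: abs(a - b), 0.5
)

# Discrete metric: isolated points are exactly the singletons "b" and "c".
discrete_estimate = good_turing_metric(
    ["a", "b", "a", "c"], lambda a, b: 0.0 if a == b else 1.0, 0.5
)
```

The pairwise loop costs O(n^2) distance evaluations, which keeps the sketch independent of any particular metric; a spatial index could replace it when the space is Euclidean.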
