Minimax Estimation of the Distance
We consider the problem of estimating the distance between two discrete probability measures and from empirical data in a nonasymptotic and large alphabet setting. When is known and one obtains samples from , we show that for every , the minimax rate-optimal estimator with samples achieves performance comparable to that of the maximum likelihood estimator (MLE) with samples. When both and are unknown, we construct minimax rate-optimal estimators whose worst case performance is essentially that of the known case with being uniform, implying that being uniform is essentially the most difficult case. The effective sample size enlargement phenomenon, identified in Jiao et al. (2015), holds both in the known case for every and the unknown case. However, the construction of optimal estimators for requires new techniques and insights beyond the Approximation methodology of functional estimation in Jiao et al. (2015).
View on arXiv