
Minimax Estimation of the $L_1$ Distance

Abstract

We consider the problem of estimating the $L_1$ distance between two discrete probability measures $P$ and $Q$ from empirical data in a nonasymptotic and large alphabet setting. When $Q$ is known and one obtains $n$ samples from $P$, we show that for every $Q$, the minimax rate-optimal estimator with $n$ samples achieves performance comparable to that of the maximum likelihood estimator (MLE) with $n \ln n$ samples. When both $P$ and $Q$ are unknown, we construct minimax rate-optimal estimators whose worst case performance is essentially that of the known $Q$ case with $Q$ being uniform, implying that $Q$ being uniform is essentially the most difficult case. The \emph{effective sample size enlargement} phenomenon, identified in Jiao \emph{et al.} (2015), holds both in the known $Q$ case for every $Q$ and the $Q$ unknown case. However, the construction of optimal estimators for $\|P-Q\|_1$ requires new techniques and insights beyond the approximation-based method of functional estimation in Jiao \emph{et al.} (2015).
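
For orientation, the MLE baseline mentioned above can be sketched as a plug-in estimator: replace $P$ by the empirical distribution of the samples and evaluate $\|\hat{P}-Q\|_1$ directly. The sketch below is illustrative only and is not the minimax rate-optimal estimator constructed in the paper; the function names `plugin_l1_distance` and `demo`, the alphabet size `S`, and the sample size `n` are assumptions chosen for the example.

```python
import random
from collections import Counter


def plugin_l1_distance(samples, q):
    """Plug-in (MLE) estimate of ||P - Q||_1 for a known Q on {0, ..., S-1}.

    `samples` are i.i.d. draws from the unknown P; `q` is the known
    probability vector of Q. Names are hypothetical, for illustration.
    """
    n = len(samples)
    counts = Counter(samples)
    # Empirical distribution: P_hat(i) = count(i) / n, zero for unseen symbols.
    return sum(abs(counts.get(i, 0) / n - q_i) for i, q_i in enumerate(q))


def demo(S=1000, n=500, seed=0):
    """Large-alphabet toy case: P = Q = uniform, so the true distance is 0."""
    rng = random.Random(seed)
    q = [1.0 / S] * S
    samples = [rng.randrange(S) for _ in range(n)]
    return plugin_l1_distance(samples, q)


if __name__ == "__main__":
    print(demo())
```

In this toy setting the true distance is zero, yet with fewer samples than symbols the plug-in estimate is at least 1, since every unseen symbol contributes its full mass under $Q$. This is the large alphabet regime in which the abstract compares the MLE unfavorably with the rate-optimal estimator.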
