
Minimax rates of entropy estimation on large alphabets via best polynomial approximation

IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2016
Yihong Wu, Pengkun Yang
Abstract

Consider the problem of estimating the Shannon entropy of a distribution over $k$ elements from $n$ independent samples. We show that the minimax mean-square error is within universal multiplicative constant factors of $\Big(\frac{k}{n \log k}\Big)^2 + \frac{\log^2 k}{n}$ if $n$ exceeds a constant factor of $\frac{k}{\log k}$; otherwise there exists no consistent estimator. This refines the recent result of Valiant-Valiant \cite{VV11} that the minimal sample size for consistent entropy estimation scales according to $\Theta(\frac{k}{\log k})$. The apparatus of best polynomial approximation plays a key role in both the construction of optimal estimators and, via a duality argument, the minimax lower bound.
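For reference, the rate in the abstract can be restated in display form (a sketch of the claimed result; the notation $\hat{H}$ for an estimator and $\mathcal{M}_k$ for the set of distributions on $k$ elements is assumed here, not taken from the page):

\[
\inf_{\hat{H}} \; \sup_{P \in \mathcal{M}_k} \; \mathbb{E}\big(\hat{H} - H(P)\big)^2 \;\asymp\; \Big(\frac{k}{n \log k}\Big)^2 + \frac{\log^2 k}{n},
\qquad \text{provided } n \gtrsim \frac{k}{\log k},
\]

where $H(P) = -\sum_i p_i \log p_i$ is the Shannon entropy and $\asymp$ denotes equality up to universal constant factors.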
