Minimax rates of entropy estimation on large alphabets via best polynomial approximation

1 July 2014

Pengkun Yang

Abstract

Consider the problem of estimating the Shannon entropy of a distribution on $k$ elements from $n$ independent samples. We show that the minimax mean-square error is within universal multiplicative constant factors of $\Big(\frac{k}{n \log k}\Big)^2 + \frac{\log^2 k}{n}$ as long as $n$ grows no faster than a polynomial of $k$. This implies the recent result of Valiant-Valiant \cite{VV11} that the minimal sample size for consistent entropy estimation scales according to $\Theta(\frac{k}{\log k})$. The apparatus of best polynomial approximation plays a key role in both the minimax lower bound and the construction of optimal estimators.
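
As a rough numerical illustration of the rate stated above, the sketch below evaluates the two terms of $(k/(n \log k))^2 + \log^2 k / n$ at sample sizes proportional to $k/\log k$. The function name `rate_terms`, the alphabet size $k = 10^6$, and the use of the natural logarithm are assumptions made for illustration only; the universal constant factors are ignored, so the numbers indicate scaling rather than actual error values.

```python
import math


def rate_terms(k: int, n: float) -> tuple[float, float]:
    """Return the two terms of (k / (n log k))^2 + (log k)^2 / n.

    Universal constants are ignored and the natural logarithm is used
    (both are illustrative assumptions, not taken from the paper).
    """
    log_k = math.log(k)
    first = (k / (n * log_k)) ** 2   # term governed by k / (n log k)
    second = log_k ** 2 / n          # term governed by log^2 k / n
    return first, second


if __name__ == "__main__":
    k = 10**6  # hypothetical alphabet size
    for c in (1, 10, 100):
        n = c * k / math.log(k)  # sample size proportional to k / log k
        first, second = rate_terms(k, n)
        print(f"c={c:>3}  n={n:12.0f}  first={first:.3e}  second={second:.3e}")
```

With $n = c \cdot k/\log k$, the first term equals $1/c^2$ regardless of $k$, while the second term equals $\log^3 k / (c k)$ and is much smaller for large $k$. This is consistent with the statement that the sample size needed for consistent estimation scales as $\Theta(k/\log k)$.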
