We prove an exponential-decay concentration inequality that bounds the tail probability of the difference between the log-likelihood of discrete random variables and the negative entropy. The bound holds uniformly over all parameter values. The new result improves the convergence rate obtained in earlier work \cite{zhao2020note}, where the rate is expressed in terms of the sample size and the number of possible values of the discrete variable. We further prove that the improved rate is optimal. The results are extended to misspecified log-likelihoods for grouped random variables.
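
For orientation, the quantity being bounded can be written out as follows. This is only a sketch in assumed notation: the symbols $n$ (sample size), $K$ (number of possible values), $p_k$, $X_i$, and $\epsilon$ are chosen here for illustration, and the exact normalization and form of the bound in the paper are not reproduced.

```latex
% Assumed notation: X_1, ..., X_n are i.i.d. on {1, ..., K} with P(X_i = k) = p_k,
% and H(p) = -\sum_k p_k \log p_k is the entropy (with the convention 0 \log 0 = 0).
\[
  \mathbb{E}\!\left[ \frac{1}{n}\sum_{i=1}^{n} \log p_{X_i} \right]
  \;=\; \sum_{k=1}^{K} p_k \log p_k
  \;=\; -H(p).
\]
% A uniform tail bound of the kind described controls, for a deviation level \epsilon > 0,
\[
  \sup_{p}\; \Pr\!\left( \left| \frac{1}{n}\sum_{i=1}^{n} \log p_{X_i} + H(p) \right| \ge \epsilon \right).
\]
```

The first display is the standard identity that makes the negative entropy the natural centering term for the normalized log-likelihood; the second shows what "uniformly over all parameter values" means under this assumed notation.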