When Fourth Moments Are Enough

Abstract

This note concerns a somewhat innocent question motivated by an observation about the use of Chebyshev bounds on sample estimates of $p$ in the binomial distribution with parameters $n, p$. Namely, what moment order produces the best Chebyshev estimate of $p$? If $S_n(p)$ has a binomial distribution with parameters $n, p$, then it is readily observed that ${\rm argmax}_{0\le p\le 1}{\mathbb E}S_n^2(p) = {\rm argmax}_{0\le p\le 1}np(1-p) = \frac12$, and ${\mathbb E}S_n^2(\frac12) = \frac{n}{4}$. Rabi Bhattacharya observed that while the second moment Chebyshev sample size for a $95\%$ confidence estimate within $\pm 5$ percentage points is $n = 2000$, the fourth moment yields the substantially reduced polling requirement of $n = 775$. Why stop at the fourth moment? Is the argmax achieved at $p = \frac12$ for higher-order moments and, if so, does it help, and can one compute $\mathbb{E}S_n^{2m}(\frac12)$? As captured by the title of this note, the answers to these questions lead to a simple rule of thumb for the best choice of moments in terms of an effective sample size for Chebyshev concentration inequalities.
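The two sample sizes quoted in the abstract can be checked numerically. The sketch below is illustrative and not taken from the paper: it assumes that $\mathbb{E}S_n^{2m}(p)$ is to be read as the $2m$-th central moment of the binomial (consistent with $\mathbb{E}S_n^2(p) = np(1-p)$), it fixes the worst case $p = \frac12$, and it applies the Markov-type bound $P(|S_n/n - \frac12| \ge \varepsilon) \le \mathbb{E}(S_n - \frac{n}{2})^{2m} / (n\varepsilon)^{2m}$. The function names `central_moment_half` and `chebyshev_bound` are hypothetical.

```python
from fractions import Fraction
from math import comb


def central_moment_half(n, m):
    """Exact E[(S_n - n/2)^(2m)] for S_n ~ Binomial(n, 1/2) (assumed reading)."""
    # P(S_n = k) = C(n, k) / 2^n and (k - n/2)^(2m) = (2k - n)^(2m) / 2^(2m),
    # so the whole sum can be carried out in integers before dividing.
    total = sum(comb(n, k) * (2 * k - n) ** (2 * m) for k in range(n + 1))
    return Fraction(total, 2 ** (n + 2 * m))


def chebyshev_bound(n, m, eps):
    """Upper bound on P(|S_n/n - 1/2| >= eps) from the 2m-th central moment."""
    return central_moment_half(n, m) / (n * eps) ** (2 * m)


eps = Fraction(1, 20)    # +/- 5 percentage points
alpha = Fraction(1, 20)  # 95% confidence

# Second moment (m = 1): n = 2000 meets the target, n = 1999 does not.
print(chebyshev_bound(2000, 1, eps) <= alpha, chebyshev_bound(1999, 1, eps) <= alpha)
# Fourth moment (m = 2): n = 775 meets the target, n = 774 does not.
print(chebyshev_bound(775, 2, eps) <= alpha, chebyshev_bound(774, 2, eps) <= alpha)
```

Exact rational arithmetic is used so that the comparisons at the boundary (for example, the second moment bound at $n = 2000$ equals $\frac{1}{20}$ exactly) are not affected by floating-point rounding. Calling `chebyshev_bound(n, m, eps)` with larger `m` gives one way to explore the abstract's question of whether higher-order moments help.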
