A Berry-Esseen theorem for incomplete U-statistics with Bernoulli sampling

There has been a resurgence of interest in the asymptotic normality of incomplete U-statistics that sum over only roughly as many kernel evaluations as there are data samples, owing to their computational efficiency and their usefulness in quantifying the uncertainty of ensemble-based predictions. In this paper, we focus on the normal convergence of one such construction, the incomplete U-statistic with Bernoulli sampling, based on a raw sample of size n and a computational budget N. Under minimalistic moment assumptions on the kernel, we offer accompanying Berry-Esseen bounds of the natural rate that characterize the accuracy of the normal approximation when n and N are of the same order, i.e. when the ratio n/N is bounded below and above by constants. Our key techniques include Stein's method specialized for the so-called Studentized nonlinear statistics, and an exponential lower tail bound for non-negative kernel U-statistics.
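For readers unfamiliar with the construction, the following Python snippet is a minimal sketch of an incomplete U-statistic with Bernoulli sampling; it is an illustration, not the paper's code. It assumes that each size-m index subset is kept independently with probability equal to the budget divided by the number of subsets, and that the sum is normalized by the realized number of kept subsets. The function name `incomplete_u_bernoulli` and its parameters are hypothetical.

```python
import itertools
import math
import random


def incomplete_u_bernoulli(x, kernel, m, budget, rng):
    """Sketch of an incomplete U-statistic with Bernoulli sampling.

    Assumption (not taken from the abstract): each of the C(n, m) index
    subsets is kept independently with probability p = budget / C(n, m),
    and the sum of kernel values is divided by the realized count.
    """
    n = len(x)
    total = math.comb(n, m)          # number of size-m subsets
    p = min(1.0, budget / total)     # Bernoulli inclusion probability
    acc = 0.0
    count = 0
    for idx in itertools.combinations(range(n), m):
        if rng.random() < p:         # independent Bernoulli(p) draw
            acc += kernel(*(x[i] for i in idx))
            count += 1
    # With a tiny budget no subset may be selected; report NaN then.
    return acc / count if count > 0 else float("nan")


def variance_kernel(a, b):
    # Order-2 kernel whose complete U-statistic is the sample variance.
    return 0.5 * (a - b) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(200)]
    # Budget of the same order as the sample size (here budget = n).
    print(incomplete_u_bernoulli(data, variance_kernel, 2, len(data), rng))
```

Enumerating all subsets is done here only for clarity; it costs as much as the complete U-statistic in loop iterations, so a practical version would draw the selected subsets directly.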