A Berry-Esseen theorem for incomplete U-statistics with Bernoulli sampling

Dennis Leung
Abstract

There has been a resurgence of interest in the asymptotic normality of incomplete U-statistics that sum over only roughly as many kernel evaluations as there are data samples, owing to their computational efficiency and usefulness in quantifying the uncertainty of ensemble-based predictions. In this paper, we focus on the normal convergence of one such construction, the incomplete U-statistic with Bernoulli sampling, based on a raw sample of size $n$ and a computational budget $N$. Under minimal moment assumptions on the kernel, we offer accompanying Berry-Esseen bounds of the natural rate $1/\sqrt{\min(N, n)}$ that characterize the accuracy of the normal approximation when $n \asymp N$, i.e. $n$ and $N$ are of the same order, in the sense that $n/N$ is bounded below and above by constants. Our key techniques include Stein's method specialized for the so-called Studentized nonlinear statistics, and an exponential lower tail bound for non-negative kernel U-statistics.
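To make the construction concrete, here is a minimal Python sketch of an incomplete U-statistic with Bernoulli sampling. It rests on assumptions not stated in the abstract: each size-`order` subset is kept independently with probability $p = N / \binom{n}{m}$, and the kept kernel values are averaged over the realized number of kept subsets (the paper may normalize by $N$ instead). The names `incomplete_u_statistic` and `variance_kernel` are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random


def incomplete_u_statistic(sample, kernel, order, budget, rng):
    """Average the kernel over subsets kept by independent Bernoulli draws.

    Assumption: each subset is kept with probability p = budget / C(n, order),
    so the expected number of kernel evaluations is roughly `budget`.
    """
    n = len(sample)
    total = math.comb(n, order)
    p = min(1.0, budget / total)
    kept_sum = 0.0
    kept_count = 0
    # Illustration only: this loop visits every subset, which a practical
    # implementation would avoid by sampling the kept subsets directly.
    for subset in itertools.combinations(sample, order):
        if rng.random() < p:
            kept_sum += kernel(*subset)
            kept_count += 1
    if kept_count == 0:
        raise ValueError("no subsets were selected; increase the budget")
    # Assumption: normalize by the realized count rather than by the budget.
    return kept_sum / kept_count, kept_count


def variance_kernel(x, y):
    # Order-2 kernel whose complete U-statistic is the unbiased sample variance.
    return 0.5 * (x - y) ** 2


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
# Budget of the same order as the sample size, mirroring the n ≍ N regime.
estimate, evaluations = incomplete_u_statistic(
    data, variance_kernel, 2, budget=200, rng=rng
)
print(estimate, evaluations)
```

With `budget` equal to the total number of subsets, every subset is kept and the sketch reduces to the complete U-statistic, which for this kernel is the unbiased sample variance.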
