
$V$-fold cross-validation and $V$-fold penalization in least-squares density estimation

Abstract

This paper studies $V$-fold cross-validation for model selection in least-squares density estimation. The goal is to provide theoretical grounds for choosing $V$ in order to minimize the least-squares risk of the selected estimator. We first prove a non-asymptotic oracle inequality for $V$-fold cross-validation and its bias-corrected version ($V$-fold penalization), with an upper bound that decreases as a function of $V$. In particular, this result implies that $V$-fold penalization is asymptotically optimal. Then, we compute the variance of $V$-fold cross-validation and related criteria, as well as the variance of key quantities for model selection performance. We show that these variances depend on $V$ like $1+1/(V-1)$ (at least in some particular cases), suggesting that performance improves markedly from $V=2$ to $V=5$ or $10$ and is then almost constant. Overall, this explains the common advice to take $V=10$ (at least in our setting and when computational power is limited), as confirmed by some simulation experiments.
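
To make the $1+1/(V-1)$ dependence concrete, the following minimal sketch tabulates that factor for a few values of $V$. It is an illustration only: the function name `variance_factor` is hypothetical, and the factor is just the $V$-dependent part quoted in the abstract, not the full variance expression, whose other terms and constants are not given here.

```python
def variance_factor(v: int) -> float:
    """Return 1 + 1/(V - 1), the V-dependent factor quoted in the abstract.

    Assumption: this is only the shape of the dependence on V; the actual
    variances involve further terms that the abstract does not state.
    """
    if v < 2:
        raise ValueError("V-fold cross-validation needs V >= 2")
    return 1.0 + 1.0 / (v - 1)


# Tabulate the factor for a few common choices of V.
for v in (2, 5, 10, 20, 100):
    print(f"V = {v:3d}: 1 + 1/(V-1) = {variance_factor(v):.3f}")
```

The factor falls from 2 at $V=2$ to 1.25 at $V=5$ and about 1.111 at $V=10$, whereas raising $V$ all the way to 100 only brings it down to about 1.010, which is consistent with the claim that most of the gain is obtained by $V=5$ or $10$.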
