arXiv:0811.0802

Optimal cross-validation in density estimation with the $L^2$-loss

5 November 2008
Alain Celisse
Abstract

We analyze the performance of cross-validation (CV) in the density estimation framework with two purposes: (i) risk estimation and (ii) model selection. The main focus is on the so-called leave-$p$-out CV procedure (Lpo), where $p$ denotes the cardinality of the test set. Closed-form expressions are derived for the Lpo estimator of the risk of projection estimators. These expressions provide a great improvement upon $V$-fold cross-validation in terms of variability and computational complexity. From a theoretical point of view, closed-form expressions also make it possible to study the Lpo performance in terms of risk estimation. The optimality of leave-one-out (Loo), that is Lpo with $p=1$, is proved among CV procedures used for risk estimation. Two model selection frameworks are also considered: estimation, as opposed to identification. For estimation with finite sample size $n$, optimality is achieved for $p$ large enough [with $p/n=o(1)$] to balance the overfitting resulting from the structure of the model collection. For identification, model selection consistency is established for Lpo as long as $p/n$ is conveniently related to the rate of convergence of the best estimator in the collection: (i) $p/n\to 1$ as $n\to+\infty$ with a parametric rate, and (ii) $p/n=o(1)$ with some nonparametric estimators. These theoretical results are validated by simulation experiments.
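To make the leave-$p$-out idea concrete, here is a minimal Python sketch for one simple projection estimator, a regular histogram on [0, 1). It is an illustration under stated assumptions, not code from the paper. The brute-force function averages the empirical $L^2$ contrast $\|\hat{s}\|^2 - \frac{2}{p}\sum_{i \in \text{test}} \hat{s}(X_i)$ over every split into $n-p$ training points and $p$ test points. The closed-form function uses an expression obtained here by averaging the bin counts over all splits (a hypergeometric argument). The abstract states that closed forms exist but does not quote them, so treat this formula as a reconstruction for the histogram case. All function names are hypothetical.

```python
from itertools import combinations
import random


def bin_index(x, n_bins):
    # Regular bins on [0, 1); a value of exactly 1.0 is clamped into the last bin.
    return min(int(x * n_bins), n_bins - 1)


def lpo_brute_force(data, n_bins, p):
    """Average the L2 contrast over all C(n, p) train/test splits (needs 1 <= p <= n - 1)."""
    n = len(data)
    q = n - p  # training-set size
    width = 1.0 / n_bins
    bins = [bin_index(x, n_bins) for x in data]
    total, n_splits = 0.0, 0
    for test in combinations(range(n), p):
        test_set = set(test)
        counts = [0] * n_bins
        for i in range(n):
            if i not in test_set:
                counts[bins[i]] += 1
        # Histogram heights fitted on the training points only.
        heights = [c / (q * width) for c in counts]
        norm_sq = sum(h * h * width for h in heights)
        test_mean = sum(heights[bins[i]] for i in test) / p
        total += norm_sq - 2.0 * test_mean
        n_splits += 1
    return total / n_splits


def lpo_closed_form(data, n_bins, p):
    """Same quantity without enumerating splits (assumed formula for regular histograms)."""
    n = len(data)
    width = 1.0 / n_bins
    counts = [0] * n_bins
    for x in data:
        counts[bin_index(x, n_bins)] += 1
    s1 = sum(c / (n * width) for c in counts)
    s2 = sum((c / n) ** 2 / width for c in counts)
    a = (2 * n - p) / ((n - 1) * (n - p))
    b = n * (n - p + 1) / ((n - 1) * (n - p))
    return a * s1 - b * s2


# Usage: compare both estimators on a small synthetic sample.
rng = random.Random(0)
sample = [rng.random() for _ in range(10)]
for p in (1, 2, 3):
    print(p, lpo_brute_force(sample, 4, p), lpo_closed_form(sample, 4, p))
```

The brute-force version costs on the order of $\binom{n}{p}$ histogram fits, while the closed form needs only one pass over the data. This is the computational gain the abstract refers to. For model selection, one would evaluate `lpo_closed_form` over a range of `n_bins` values and keep the minimizer.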
