A new method for combining several initial estimators of the regression function is introduced. Instead of building a linear or convex optimized combination over a collection of basic estimators, we use them as a collective indicator of the distance between the training data and a test observation. This local distance approach is model-free and extremely fast. Most importantly, the resulting collective estimator is shown to perform asymptotically at least as well, in the $L^2$ sense, as the best basic estimator in the collective. Moreover, it does so without having to declare which might be the best basic estimator for the given data set. A companion R package called COBRA (standing for COmBined Regression Alternative) is presented and can be downloaded from http://cran.r-project.org/web/packages/COBRA/index.html. Numerical evidence on both synthetic and real data sets is provided to assess the excellent performance of our method in a large variety of prediction problems.
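To make the "collective indicator of distance" idea concrete, here is a minimal Python sketch, not the interface of the R package. It assumes the basic estimators have already been fitted on one part of the data and are passed in as plain callables. A held-out point counts as close to a query when enough of the estimators predict values within `eps` of each other at the two locations, and the prediction is the average response over those close points. The names `cobra_predict`, `machines`, `X_agg`, `y_agg`, `eps`, and `alpha` are illustrative assumptions, as is the choice to return 0 when no held-out point qualifies.

```python
import numpy as np

def cobra_predict(machines, X_agg, y_agg, x_new, eps, alpha=1.0):
    """Consensus-based combination of pre-fitted regressors (illustrative sketch).

    machines : list of callables, each mapping an (n, d) array to (n,) predictions
    X_agg, y_agg : held-out sample used only for the aggregation step
    x_new : (q, d) array of query points
    eps : tolerance on the difference between predictions
    alpha : fraction of machines that must agree (1.0 means all of them)
    """
    y_agg = np.asarray(y_agg, dtype=float)
    # Predictions of every machine on the held-out sample: shape (M, n)
    preds_agg = np.array([m(X_agg) for m in machines], dtype=float)
    # Predictions of every machine at the query points: shape (M, q)
    preds_new = np.array([m(x_new) for m in machines], dtype=float)

    # close[m, j, i] is True when machine m predicts nearly the same value
    # at query j and at held-out point i
    close = np.abs(preds_new[:, :, None] - preds_agg[:, None, :]) <= eps

    # Held-out point i is retained for query j when enough machines agree
    votes = close.sum(axis=0)                # shape (q, n)
    keep = votes >= alpha * len(machines)    # shape (q, n)

    counts = keep.sum(axis=1)
    sums = keep.astype(float) @ y_agg
    # Assumed convention for this sketch: predict 0 when nothing is retained
    return np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)

# Toy usage with two hypothetical "machines" on one-dimensional inputs
machines = [lambda X: X[:, 0], lambda X: 2.0 * X[:, 0]]
X_agg = np.array([[0.0], [1.0], [2.0], [10.0]])
y_agg = np.array([1.0, 3.0, 8.0, 100.0])
x_new = np.array([[1.0]])

strict = cobra_predict(machines, X_agg, y_agg, x_new, eps=1.0)              # all machines must agree
relaxed = cobra_predict(machines, X_agg, y_agg, x_new, eps=1.0, alpha=0.5)  # half is enough
```

Note that the raw inputs never enter a distance computation: only the estimators' predictions do, which is why the combination needs no metric on the input space. In practice `eps` and `alpha` would be tuned on data, for example by cross-validation.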