The Slope Heuristics in Heteroscedastic Regression

We consider the estimation of a regression function with random design and heteroscedastic noise in a nonparametric setting. More precisely, we address the problem of characterizing the optimal penalty when the regression function is estimated by a penalized least-squares model selection procedure. In this context, we show the existence of a minimal penalty, defined as the maximum level of penalization below which the model selection procedure totally misbehaves. Moreover, the optimal penalty is shown to be twice the minimal one and to satisfy a nonasymptotic pathwise oracle inequality with leading constant almost one. When the shape of the optimal penalty is known, this makes it possible to apply the so-called slope heuristics, initially proposed by Birgé and Massart (2007), which yields a data-driven calibration of the penalty. Finally, results previously obtained by the author (2010) on the least-squares estimation of a regression function on a fixed finite-dimensional linear model allow us to go beyond the case of histogram models, already treated by Arlot and Massart (2009).
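To make the slope heuristics concrete, here is a minimal sketch of one standard way to implement it, the dimension-jump method: the minimal penalty constant is located at the largest drop in the dimension of the selected model as the penalty constant increases, and the calibrated constant is twice that value. Everything below is an illustrative assumption rather than the paper's actual procedure: the penalty shape is taken proportional to D_m/n (as for histogram models), and the empirical risks are synthetic toy values.

```python
import numpy as np

def select_model(emp_risk, pen_shape, C):
    # Index of the model minimizing the penalized criterion
    # emp_risk[m] + C * pen_shape[m]. Models are assumed ordered
    # by increasing dimension, so the index tracks the dimension.
    return int(np.argmin(emp_risk + C * pen_shape))

def slope_heuristics(emp_risk, pen_shape, c_grid):
    # Dimension-jump calibration: as C grows past the minimal penalty
    # constant, the dimension of the selected model drops sharply.
    selected = np.array([select_model(emp_risk, pen_shape, C) for C in c_grid])
    jumps = selected[:-1] - selected[1:]       # per-step drop in selected index
    c_min = c_grid[int(np.argmax(jumps)) + 1]  # constant right after the jump
    c_opt = 2.0 * c_min                        # optimal penalty = twice minimal
    return c_opt, select_model(emp_risk, pen_shape, c_opt)

if __name__ == "__main__":
    # Toy example: bias ~ 1/D, variance-like term D/n, small noise.
    rng = np.random.default_rng(0)
    n, dims = 500, np.arange(1, 251)
    emp_risk = 1.0 / dims - dims / n + 0.002 * rng.standard_normal(dims.size)
    pen_shape = dims / n                       # assumed known penalty shape
    c_grid = np.linspace(0.0, 4.0, 401)
    c_opt, m_hat = slope_heuristics(emp_risk, pen_shape, c_grid)
    print(f"calibrated constant ~ {c_opt:.2f}, selected dimension = {dims[m_hat]}")
```

In this toy setup the empirical risk decreases roughly linearly in D_m/n with slope -1, so the dimension jump occurs near constant 1 and the calibrated constant comes out near 2, matching the "optimal penalty is twice the minimal one" prescription.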