
High-dimensional Gaussian model selection on a Gaussian design

Abstract

We consider the problem of estimating the conditional mean of a real Gaussian variable $Y=\sum_{i=1}^p \theta_i X_i + \epsilon$, where the vector of covariates $(X_i)_{1\leq i\leq p}$ follows a joint Gaussian distribution. This problem often arises when one aims at estimating the graph or the distribution of a Gaussian graphical model. We introduce a general model selection procedure based on the minimization of a penalized least-squares type criterion. It handles a variety of problems such as ordered and complete variable selection, allows prior knowledge on the model to be incorporated, and applies when the number of covariates $p$ is larger than the number of observations $n$. Moreover, it is shown to achieve a non-asymptotic oracle inequality independently of the correlation structure of the covariates. We also exhibit various minimax rates of estimation in the considered framework and hence derive adaptiveness properties of our procedure.
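
As a rough illustration of ordered variable selection by a penalized least-squares criterion, the following Python sketch fits the nested models built from the first $d$ covariates and keeps the dimension that minimizes a penalized residual sum of squares. The function name `select_ordered_model`, the multiplicative penalty $K d/(n-d)$, the constant `K`, and the simulated data are all assumptions made for this example; they are not taken from the paper and should not be read as its exact procedure.

```python
import numpy as np

def select_ordered_model(X, y, d_max, K=2.0):
    """Pick a dimension d in {0, ..., d_max} among nested models.

    Model d regresses y on the first d columns of X. The criterion
    (RSS_d / n) * (1 + K * d / (n - d)) is an illustrative choice of
    penalized least-squares criterion, not the paper's exact one.
    Requires d_max < n so that n - d stays positive.
    """
    n = X.shape[0]
    crit = np.empty(d_max + 1)
    for d in range(d_max + 1):
        if d == 0:
            # Empty model: the fitted conditional mean is zero.
            rss = float(np.sum(y ** 2))
        else:
            coef, *_ = np.linalg.lstsq(X[:, :d], y, rcond=None)
            resid = y - X[:, :d] @ coef
            rss = float(resid @ resid)
        crit[d] = (rss / n) * (1.0 + K * d / (n - d))
    return int(np.argmin(crit)), crit

# Simulated Gaussian design with more covariates than observations (p > n).
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
theta = np.zeros(p)
theta[:3] = [3.0, 2.0, 1.5]          # only the first three covariates matter
y = X @ theta + rng.standard_normal(n)

d_max = 20
d_hat, crit = select_ordered_model(X, y, d_max)
print("selected dimension:", d_hat)
```

Because the candidate models are nested and capped at `d_max < n`, each least-squares fit stays well defined even though $p > n$; complete variable selection would instead search over subsets of covariates, which this sketch does not attempt.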
