Convex programming approach to robust estimation of a multivariate Gaussian model

The multivariate Gaussian distribution is often used as a first approximation to the distribution of high-dimensional data. Determining the parameters of this distribution under various constraints is a widely studied problem in statistics, and is often treated as a prototype for testing new algorithms or theoretical frameworks. In this paper, we develop a nonasymptotic approach to the problem of estimating the parameters of a multivariate Gaussian distribution when the data are corrupted by outliers. We propose an estimator, efficiently computable by solving a convex program, that robustly estimates the population mean and the population covariance matrix even when the sample contains a significant proportion of outliers. In the case where the dimension $p$ of the data points is of smaller order than the sample size $n$, our estimator of the corruption matrix is provably rate optimal simultaneously for the entry-wise $\ell_1$-norm, the Frobenius norm and the mixed $\ell_2/\ell_1$ norm. Furthermore, this optimality is achieved by a penalized square-root-of-least-squares method with a universal tuning parameter (calibrating the strength of the penalization). These results are partly extended to the case where $p$ is potentially larger than $n$, under the additional condition that the inverse covariance matrix is sparse.
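The abstract names three matrix norms without defining them. As a reading aid, they can be written out under the usual conventions. The notation is an assumption made for illustration and is not taken from the source: $\Theta$ stands for an $n \times p$ corruption matrix whose $i$-th row $\Theta_{i,\bullet}$ is nonzero only if observation $i$ is an outlier, and the mixed norm is taken to be the sum of the Euclidean norms of the rows.

```latex
% Illustrative notation (assumed, not from the source):
% \Theta is an n x p matrix, \Theta_{i,\bullet} denotes its i-th row.
\[
\|\Theta\|_{1} = \sum_{i=1}^{n}\sum_{j=1}^{p} |\Theta_{ij}|,
\qquad
\|\Theta\|_{F} = \Big(\sum_{i=1}^{n}\sum_{j=1}^{p} \Theta_{ij}^{2}\Big)^{1/2},
\qquad
\|\Theta\|_{2,1} = \sum_{i=1}^{n} \|\Theta_{i,\bullet}\|_{2}.
\]
```

With this convention, the mixed norm penalizes whole rows at once, so it favors matrices in which only a few observations are flagged as corrupted.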