Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time

Robust covariance estimation is the following well-studied problem in high-dimensional statistics: given $N$ samples from a $d$-dimensional Gaussian $\mathcal{N}(0, \Sigma)$, but where an $\epsilon$-fraction of the samples have been arbitrarily corrupted, output $\hat{\Sigma}$ minimizing the total variation distance between $\mathcal{N}(0, \Sigma)$ and $\mathcal{N}(0, \hat{\Sigma})$. This corresponds to learning $\Sigma$ in a natural affine-invariant variant of the Frobenius norm known as the \emph{Mahalanobis norm}. Previous work of Cheng et al. demonstrated an algorithm that, given $N = \tilde{\Omega}(d^2 / \epsilon^2)$ samples, achieved a near-optimal error of $O(\epsilon \log(1/\epsilon))$, and moreover, their algorithm ran in time $\tilde{O}(T(N, d) \log \kappa / \mathrm{poly}(\epsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose, and $\kappa$ is the condition number of $\Sigma$. When $\epsilon$ is relatively small, their polynomial dependence on $1/\epsilon$ in the runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees, but which runs in time $\tilde{O}(T(N, d) \log \kappa)$. In particular, our runtime has no dependence on $\epsilon$. When $\Sigma$ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."
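To make the problem setup concrete, here is a minimal NumPy sketch of the corruption model and the error metric. It is purely illustrative, not the paper's algorithm: the dimensions, the contamination level, and the specific outlier pattern are assumptions chosen for the demo, and the Mahalanobis norm is computed as $\|\Sigma^{-1/2} \hat{\Sigma} \Sigma^{-1/2} - I\|_F$.

```python
# Illustrative sketch of robust covariance estimation (NOT the paper's algorithm):
# sample from N(0, Sigma), corrupt an eps-fraction, and measure the error of the
# empirical covariance in the Mahalanobis norm || Sigma^{-1/2} Sigma_hat Sigma^{-1/2} - I ||_F.
import numpy as np

rng = np.random.default_rng(0)
d, N, eps = 20, 50_000, 0.05  # illustrative constants, not from the paper

# Ground-truth covariance with a nontrivial condition number kappa.
A = rng.standard_normal((d, d))
Sigma = A @ A.T + 0.1 * np.eye(d)

X = rng.multivariate_normal(np.zeros(d), Sigma, size=N)

# Corrupt an eps-fraction of the samples; a large-norm cluster is one simple
# adversarial choice (the model allows arbitrary corruptions).
k = int(eps * N)
X[:k] = 10.0 * np.sqrt(np.trace(Sigma)) * rng.standard_normal((k, d))

def mahalanobis_error(Sigma_hat, Sigma):
    """|| Sigma^{-1/2} Sigma_hat Sigma^{-1/2} - I ||_F."""
    w, V = np.linalg.eigh(Sigma)
    inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    M = inv_sqrt @ Sigma_hat @ inv_sqrt
    return np.linalg.norm(M - np.eye(d), ord="fro")

# The naive empirical covariance is one multiplication of a d x N matrix by its
# transpose -- a single T(N, d) operation -- but the outliers badly skew it.
Sigma_naive = X.T @ X / N
print("naive error:", mahalanobis_error(Sigma_naive, Sigma))

# Oracle baseline using only the uncorrupted samples, for comparison.
Sigma_clean = X[k:].T @ X[k:] / (N - k)
print("clean error:", mahalanobis_error(Sigma_clean, Sigma))
```

Running the sketch shows the gap a robust estimator must close: the naive estimate incurs a large Mahalanobis error under even this crude contamination, while the oracle baseline is accurate. The point of the paper is to match the oracle's guarantee, up to the $O(\epsilon \log(1/\epsilon))$ error, in essentially the same $\tilde{O}(T(N, d) \log \kappa)$ time.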