Robust Sparse Mean Estimation via Sum of Squares

We study the problem of high-dimensional sparse mean estimation in the presence of an -fraction of adversarial outliers. Prior work obtained sample and computationally efficient algorithms for this task for identity-covariance subgaussian distributions. In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance. For distributions on with "certifiably bounded" -th moments and sufficiently light tails, our algorithm achieves error of with sample complexity . For the special case of the Gaussian distribution, our algorithm achieves near-optimal error of with sample complexity . Our algorithms follow the Sum-of-Squares based, proofs to algorithms approach. We complement our upper bounds with Statistical Query and low-degree polynomial testing lower bounds, providing evidence that the sample-time-error tradeoffs achieved by our algorithms are qualitatively the best possible.
View on arXiv