SoS Certifiability of Subgaussian Distributions and its Algorithmic Applications

Abstract

We prove that there is a universal constant $C>0$ so that for every $d \in \mathbb N$, every centered subgaussian distribution $\mathcal D$ on $\mathbb R^d$, and every even $p \in \mathbb N$, the $d$-variate polynomial $(Cp)^{p/2} \cdot \|v\|_{2}^p - \mathbb E_{X \sim \mathcal D} \langle v,X\rangle^p$ is a sum of square polynomials. This establishes that every subgaussian distribution is \emph{SoS-certifiably subgaussian} -- a condition that yields efficient learning algorithms for a wide variety of high-dimensional statistical tasks. As a direct corollary, we obtain computationally efficient algorithms with near-optimal guarantees for the following tasks, when given samples from an arbitrary subgaussian distribution: robust mean estimation, list-decodable mean estimation, clustering mean-separated mixture models, robust covariance-aware mean estimation, robust covariance estimation, and robust linear regression. Our proof makes essential use of Talagrand's generic chaining/majorizing measures theorem.
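
As a sanity check on the form of the statement, the following worked case may help. It is an illustration only and is not taken from the paper: it assumes the standard Gaussian $\mathcal D = N(0, I_d)$, for which the moments are known in closed form, and it assumes the choice $C = 1$. The general subgaussian case is the hard part and is not covered by this calculation.

```latex
% Illustrative special case (assumption: X ~ N(0, I_d), C = 1, p even).
% For a standard Gaussian, <v, X> ~ N(0, ||v||_2^2), so
\[
  \mathbb E_{X \sim N(0, I_d)} \langle v, X \rangle^p
  = (p-1)!! \, \|v\|_2^p ,
  \qquad (p-1)!! = 1 \cdot 3 \cdots (p-1) \le p^{p/2}.
\]
% Hence the polynomial in the statement is a nonnegative multiple of ||v||_2^p:
\[
  p^{p/2} \, \|v\|_2^p - \mathbb E \langle v, X \rangle^p
  = \bigl( p^{p/2} - (p-1)!! \bigr) \Bigl( \sum_{i=1}^d v_i^2 \Bigr)^{p/2}.
\]
% A power of the sum of squares \sum_i v_i^2 is itself a sum of squares
% (products of sums of squares are sums of squares), so the right-hand side
% is a sum of square polynomials in v.
```

For a general subgaussian distribution the moment polynomial $\mathbb E \langle v, X\rangle^p$ is no longer a multiple of $\|v\|_2^p$, so no such one-line argument is available.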
