List-Decodable Linear Regression
We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than $1/2$ fraction of examples.

For any $\alpha < 1$, our algorithm takes as input a sample $\{(x_i, y_i)\}_{i \le n}$ of $n$ linear equations where $\alpha n$ of the equations satisfy $y_i = \langle x_i, \ell^* \rangle + \zeta$ for some small noise $\zeta$ and $(1-\alpha) n$ of the equations are \emph{arbitrarily} chosen. It outputs a list $L$ of size $O(1/\alpha)$ - a fixed constant - that contains an $\ell$ that is close to $\ell^*$.

Our algorithm succeeds whenever the inliers are chosen from a \emph{certifiably} anti-concentrated distribution $D$. As a special case, this yields a $d^{\mathrm{poly}(1/\alpha)}$ time algorithm to find an $O(1/\alpha)$-size list when the inlier distribution is the standard Gaussian. The anti-concentration assumption on the inliers is information-theoretically necessary. Our algorithm works for more general distributions under the additional assumption that $\ell^*$ is Boolean valued.

To solve the problem, we introduce a new framework for list-decodable learning that strengthens the sum-of-squares `identifiability to algorithms' paradigm.

In independent work, Raghavendra and Yau [RY19] obtained a similar result for list-decodable regression, also using the sum-of-squares method.
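The following is a minimal Python/NumPy sketch of the input model and success criterion described above, not of the paper's algorithm: $\alpha n$ inlier equations with standard-Gaussian $x_i$ and small noise, $(1-\alpha)n$ adversarially chosen equations, and a check that some candidate in a returned list is close to $\ell^*$. The sample sizes, noise level, and the particular "decoy" outlier pattern are illustrative assumptions; the least-squares baseline only shows why a single estimate cannot succeed when inliers are a minority.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, alpha = 20, 5000, 0.3          # dimension, sample size, inlier fraction
ell_star = rng.standard_normal(d)    # unknown regressor generating the inliers
ell_star /= np.linalg.norm(ell_star)

# Inliers: x_i ~ N(0, I_d), y_i = <x_i, ell_star> + small noise.
n_in = int(alpha * n)
X_in = rng.standard_normal((n_in, d))
y_in = X_in @ ell_star + 0.01 * rng.standard_normal(n_in)

# Outliers: the remaining (1 - alpha) n equations are adversarial; here
# (as one illustrative choice) they follow a different "decoy" regressor.
n_out = n - n_in
decoy = rng.standard_normal(d)
X_out = rng.standard_normal((n_out, d))
y_out = X_out @ decoy

X = np.vstack([X_in, X_out])
y = np.concatenate([y_in, y_out])

def list_is_good(candidates, ell_star, eps=0.1):
    """Success criterion: some candidate in the O(1/alpha)-size list is eps-close to ell_star."""
    return any(np.linalg.norm(ell - ell_star) <= eps for ell in candidates)

# Ordinary least squares on the corrupted sample is pulled toward the decoy,
# since the majority of equations follow it -- which is why a *list* of
# candidates, rather than a single estimate, is the right output here.
ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print("OLS distance to ell_star:", np.linalg.norm(ols - ell_star))
print("OLS succeeds as a singleton list:", list_is_good([ols], ell_star))
```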