
arXiv:1211.1041 (v3, latest)

Algorithms and Hardness for Robust Subspace Recovery

5 November 2012
Moritz Hardt
Ankur Moitra
Abstract

We consider a fundamental problem in unsupervised learning called \emph{subspace recovery}: given a collection of $m$ points in $\mathbb{R}^n$, if many but not necessarily all of these points are contained in a $d$-dimensional subspace $T$, can we find it? The points contained in $T$ are called \emph{inliers} and the remaining points are \emph{outliers}. This problem has received considerable attention in computer science and in statistics. Yet efficient algorithms from computer science are not robust to \emph{adversarial} outliers, and the estimators from robust statistics are hard to compute in high dimensions. Are there algorithms for subspace recovery that are both robust to outliers and efficient? We give an algorithm that finds $T$ when it contains more than a $\frac{d}{n}$ fraction of the points. Hence, for say $d = n/2$, this estimator is both easy to compute and well-behaved when there is a constant fraction of outliers. We prove that it is Small Set Expansion hard to find $T$ when the fraction of errors is any larger, thus giving evidence that our estimator is an \emph{optimal} compromise between efficiency and robustness. As it turns out, this basic problem has a surprising number of connections to other areas including small set expansion, matroid theory and functional analysis that we make use of here.
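The abstract does not spell out the algorithm, but the problem setup can be illustrated with a small sketch in Python using NumPy. The dimensions, the inlier fraction, and the sample-and-test recovery step below are illustrative assumptions, not the paper's estimator: we plant inliers in a random $d$-dimensional subspace $T$ of $\mathbb{R}^n$, add generic outliers, then repeatedly draw $d+1$ points and test them for linear dependence, since a linearly dependent sample almost surely consists entirely of inliers and spans $T$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 6, 3, 200          # ambient dim, subspace dim, number of points
inlier_frac = 0.7            # comfortably above the d/n = 0.5 threshold

# A random d-dimensional subspace T, represented by an orthonormal n x d basis.
T_basis = np.linalg.qr(rng.standard_normal((n, d)))[0]

num_in = int(inlier_frac * m)
inliers = rng.standard_normal((num_in, d)) @ T_basis.T  # points inside T
outliers = rng.standard_normal((m - num_in, n))         # generic outliers
points = np.vstack([inliers, outliers])

def recover_subspace(points, d, trials=2000):
    """Sample d+1 points; if they are linearly dependent (rank d),
    they almost surely all lie in T, and their span recovers T."""
    m = points.shape[0]
    for _ in range(trials):
        idx = rng.choice(m, size=d + 1, replace=False)
        sample = points[idx]                 # (d+1) x n matrix of points
        if np.linalg.matrix_rank(sample) == d:
            # The top-d left singular vectors of sample.T span the sample.
            return np.linalg.svd(sample.T)[0][:, :d]
    return None

T_hat = recover_subspace(points, d)
proj = T_hat @ T_hat.T                       # projector onto recovered span
print(np.allclose(inliers @ proj, inliers))  # True: every inlier lies in T_hat
```

Each trial succeeds when all $d+1$ sampled points are inliers, which happens with constant probability at this inlier fraction, so a few thousand trials recover $T$ with high probability. This naive rejection sampling only works when inliers are a large fraction of the points; the paper's contribution is an efficient estimator that succeeds all the way down to the $\frac{d}{n}$ threshold.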
