We assume data sampled from a mixture of d-dimensional linear subspaces with spherically symmetric outliers. We study the recovery of the global l0 subspace (i.e., the one containing the largest number of points) by minimizing the lp-averaged distances of the data points from d-dimensional subspaces of R^D, where p>0. Unlike other lp minimization problems, this minimization is non-convex for all p>0 and thus requires different methods for its analysis. We show that if 0<p<=1, then the global l0 subspace can be recovered by lp minimization with overwhelming probability (which depends on the generating distribution and its parameters). Moreover, when homoscedastic noise is added around the underlying subspaces, the generalized l0 subspace (the one with the largest number of points "around it") can, with overwhelming probability, be nearly recovered by lp minimization with an error proportional to the noise level. On the other hand, if p>1 and there is more than one underlying subspace, then with overwhelming probability the global l0 subspace cannot be recovered and the generalized one cannot even be nearly recovered.
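For concreteness, a minimal sketch of the lp objective described above, under assumed notation not fixed by the abstract itself (data points x_1,...,x_N in R^D, orthogonal projection P_L onto a subspace L, and the Grassmannian G(D,d) of d-dimensional linear subspaces of R^D):

% lp energy of a d-dimensional subspace L over the data (illustrative notation)
\[
  e_{\ell_p}(L) \;=\; \sum_{i=1}^{N} \operatorname{dist}(x_i, L)^{p},
  \qquad \operatorname{dist}(x, L) \;=\; \bigl\| x - P_L x \bigr\|_2 ,
\]
% the estimated subspace is a global minimizer over the Grassmannian
\[
  \hat{L} \;\in\; \operatorname*{arg\,min}_{L \in \mathrm{G}(D,d)} \, e_{\ell_p}(L).
\]

Minimizing the lp-averaged distances, i.e., e_{l_p}(L)/N, yields the same minimizer; the non-convexity referred to above is that of e_{l_p} over the set of d-dimensional subspaces, for every p>0.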