14
12

A Fast Factorization-based Approach to Robust PCA

Abstract

Robust principal component analysis (RPCA) has been widely used for recovering low-rank matrices in many data mining and machine learning problems. It separates a data matrix into a low-rank part and a sparse part. The convex approach has been well studied in the literature. However, state-of-the-art algorithms for the convex approach usually have relatively high complexity due to the need of solving (partial) singular value decompositions of large matrices. A non-convex approach, AltProj, has also been proposed with lighter complexity and better scalability. Given the true rank rr of the underlying low rank matrix, AltProj has a complexity of O(r2dn)O(r^2dn), where d×nd\times n is the size of data matrix. In this paper, we propose a novel factorization-based model of RPCA, which has a complexity of O(kdn)O(kdn), where kk is an upper bound of the true rank. Our method does not need the precise value of the true rank. From extensive experiments, we observe that AltProj can work only when rr is precisely known in advance; however, when the needed rank parameter rr is specified to a value different from the true rank, AltProj cannot fully separate the two parts while our method succeeds. Even when both work, our method is about 4 times faster than AltProj. Our method can be used as a light-weight, scalable tool for RPCA in the absence of the precise value of the true rank.

View on arXiv
Comments on this paper