Stochastic Low-Rank Subspace Clustering by Auxiliary Variable Modeling

Abstract

Low-Rank Representation (LRR) is a popular tool for identifying data generated by a union of subspaces. However, the size of the regularized matrix in LRR is proportional to $n^2$, where $n$ is the number of samples, which hinders its use on large-scale problems. In this paper, we study how to scale up the LRR method accurately and memory-efficiently. In particular, we propose an online implementation that reduces the memory cost from $O(n^2)$ to $O(pd)$, where $p$ is the ambient dimension and $d$ is some expected rank. Our algorithm relies on two key techniques: the first reformulates the nuclear norm as an equivalent matrix factorization, and the second introduces an auxiliary variable that serves as a basis dictionary for the underlying data. Together, these make the problem amenable to stochastic optimization. We establish a theoretical guarantee that the sequence of solutions produced by our algorithm converges asymptotically to a stationary point. Numerical experiments on subspace recovery and subspace clustering tasks demonstrate the efficacy and robustness of the proposed algorithm.
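
For orientation, the matrix factorization form of the nuclear norm mentioned above is presumably the standard variational characterization shown below. The abstract does not state the exact formulation, so this is a sketch under that assumption. The symbols $X$, $U$, $V$, and $m$ are introduced here for illustration only, and the identity holds whenever $d \ge \operatorname{rank}(X)$.

```latex
% Standard variational form of the nuclear norm (assumed, not quoted from the paper).
% X is an m-by-n matrix; U and V are factors with d columns, d >= rank(X).
\[
  \|X\|_{*}
  \;=\;
  \min_{\substack{U \in \mathbb{R}^{m \times d},\; V \in \mathbb{R}^{n \times d} \\ X = U V^{\top}}}
  \;\frac{1}{2}\left( \|U\|_{F}^{2} + \|V\|_{F}^{2} \right)
\]
% The minimum is attained at U = L \Sigma^{1/2}, V = R \Sigma^{1/2},
% where X = L \Sigma R^{\top} is a thin singular value decomposition.
```

Replacing the nuclear norm with the two Frobenius-norm terms means the penalty decomposes over the rows of $V$, which is typically what allows samples to be processed one at a time while only a small factor is kept in memory.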
