Stochastic Low-Rank Subspace Clustering by Auxiliary Variable Modeling
Low-Rank Representation (LRR) is a popular tool for identifying data generated from a union of subspaces. However, the size of the representation matrix regularized by LRR is proportional to n^2, where n is the number of samples, which hinders the application of LRR to large-scale problems. In this paper, we study how to scale up the LRR method accurately and memory-efficiently. In particular, we propose an online implementation that reduces the memory cost from O(n^2) to O(pd), with p being the ambient dimension and d being some expected rank. Our algorithm rests on two key techniques: one is to reformulate the nuclear norm into an equivalent matrix factorization form, and the other is to introduce an auxiliary variable that serves as a basis dictionary for the underlying data; combined, they make the problem amenable to stochastic optimization. We establish a theoretical guarantee that the sequence of solutions produced by our algorithm converges asymptotically to a stationary point. Numerical experiments on subspace recovery and subspace clustering tasks demonstrate the efficacy and robustness of the proposed algorithm.
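The first key technique above, rewriting the nuclear norm as a matrix factorization, rests on the standard variational identity ||X||_* = min over factorizations X = UV of (1/2)(||U||_F^2 + ||V||_F^2), whose minimum is attained by a balanced SVD split. A minimal NumPy sketch checking this identity numerically (U, V here are generic factor names, not necessarily the paper's notation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random low-rank data matrix X of size p x n with rank r.
p, n, r = 20, 30, 5
X = rng.standard_normal((p, r)) @ rng.standard_normal((r, n))

# Nuclear norm: sum of singular values of X.
nuclear = np.linalg.svd(X, compute_uv=False).sum()

# Balanced factorization from the SVD X = A diag(s) B^T:
# U = A sqrt(diag(s)), V = sqrt(diag(s)) B^T attains the minimum of
# (1/2)(||U||_F^2 + ||V||_F^2) over all factorizations X = U V.
A, s, Bt = np.linalg.svd(X, full_matrices=False)
U = A * np.sqrt(s)             # p x min(p, n)
V = np.sqrt(s)[:, None] * Bt   # min(p, n) x n

factored = 0.5 * (np.linalg.norm(U, "fro") ** 2
                  + np.linalg.norm(V, "fro") ** 2)
```

Because the factors have fixed inner dimension, this reformulation lets a solver store U and V instead of an n x n coefficient matrix, which is what enables the stochastic (one-sample-at-a-time) updates.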