In this paper, we address two problems in unsupervised subspace learning: 1) how to automatically identify the feature dimension of the learned subspace, and 2) how to learn the underlying subspace in the presence of corruptions such as Gaussian noise. We show that these two problems are two sides of the same coin, i.e., both can be solved by removing possible errors from the training data. To achieve this, we propose a new method, called Principal Coefficients Embedding (PCE), that simultaneously learns a clean data set $D_0$ and a linear representation (denoted by $C$) of $D_0$ from the input data $D$. By embedding $C$ into an $m'$-dimensional space, PCE obtains a projection matrix that preserves some desirable properties of the inputs, where $m'$ is exactly the rank of $C$. PCE has three advantages: 1) it can automatically determine the feature dimension even when the data are sampled from a union of multiple linear subspaces; 2) it is robust to various types of noise and to real disguises; 3) it has a closed-form solution and can be computed very fast. Extensive experimental results show the superiority of PCE on a range of databases with respect to classification accuracy, robustness, and efficiency.
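Since the abstract describes the pipeline only at a high level, the following is a minimal illustrative sketch of how such a method could operate, assuming (hypothetically) that a truncated SVD recovers the clean data set $D_0$ and the coefficient matrix $C$; the function name `pce_sketch` and the noise threshold `tau` are placeholders for illustration, not the paper's actual formulation.

```python
import numpy as np

def pce_sketch(D, tau=1.0):
    """A minimal sketch of the pipeline the abstract describes.

    Assumptions (not taken from the paper): the clean data set D0 is
    recovered by discarding singular values of D below a noise threshold
    tau, and the linear representation C is built from the corresponding
    right singular vectors, so that D0 = D0 @ C and rank(C) = m'.
    """
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    keep = s > tau                           # singular values above the noise level
    m_prime = int(keep.sum())                # feature dimension m' = rank(C)
    D0 = (U[:, keep] * s[keep]) @ Vt[keep]   # clean data set
    C = Vt[keep].T @ Vt[keep]                # linear representation: D0 = D0 @ C
    # Embed C into an m'-dimensional space: the top-m' eigenvectors of C
    # give low-dimensional coordinates for the n samples.
    _, V = np.linalg.eigh(C)                 # eigenvalues in ascending order
    embedding = V[:, -m_prime:]              # n x m' sample coordinates
    return D0, C, embedding

# Example: 100 noisy samples of 50-dimensional data near a 5-dimensional subspace;
# the detected feature dimension m' should come out as 5.
D = np.random.randn(50, 5) @ np.random.randn(5, 100) + 0.01 * np.random.randn(50, 100)
D0, C, Y = pce_sketch(D, tau=0.5)
```

Note that in this sketch the feature dimension is a byproduct of the rank of $C$ rather than a user-specified parameter, which mirrors the automatic dimension selection claimed in the abstract.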