Streaming k-PCA: Efficient guarantees for Oja's algorithm, beyond rank-one updates

Abstract

We analyze Oja's algorithm for streaming $k$-PCA and prove that it achieves performance nearly matching that of an optimal offline algorithm. Given access to a sequence of i.i.d. $d \times d$ symmetric matrices, we show that Oja's algorithm can obtain an accurate approximation to the subspace of the top $k$ eigenvectors of their expectation using a number of samples that scales polylogarithmically with $d$. Previously, such a result was only known in the case where the updates have rank one. Our analysis is based on recently developed matrix concentration tools, which allow us to prove strong bounds on the tails of the random matrices which arise in the course of the algorithm's execution.
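
For orientation, the following is a minimal sketch of a common form of Oja's update for streaming $k$-PCA, in which each symmetric sample $A_t$ is applied multiplicatively as $(I + \eta A_t)$ and the iterate is re-orthonormalized by a QR factorization. It is not taken from the paper: the function names, the constant step size `step_size`, the per-step QR, and the Gaussian noise model in `sample_stream` are all assumptions made for illustration, and the paper's exact variant and step-size schedule may differ.

```python
import numpy as np

def oja_streaming_kpca(samples, d, k, step_size, rng):
    """Sketch of Oja's update on a stream of symmetric d x d matrices.

    Assumes a constant step size and a QR step after every sample.
    """
    # Random orthonormal starting basis of shape (d, k).
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for A in samples:
        # Apply (I + step_size * A) to the basis, then re-orthonormalize.
        Q, _ = np.linalg.qr(Q + step_size * (A @ Q))
    return Q

def sample_stream(M, n, noise, rng):
    """Yield n i.i.d. symmetric matrices with expectation M.

    Hypothetical noise model: M plus a symmetrized Gaussian matrix,
    so the updates are generally full rank rather than rank one.
    """
    d = M.shape[0]
    for _ in range(n):
        G = rng.standard_normal((d, d))
        yield M + noise * (G + G.T) / 2

rng = np.random.default_rng(0)
d, k = 20, 3

# Expectation M with a clear gap after its top-k eigenvalues.
eigvals = np.concatenate([np.ones(k), 0.2 * np.ones(d - k)])
U, _ = np.linalg.qr(rng.standard_normal((d, d)))
M = (U * eigvals) @ U.T

Q = oja_streaming_kpca(sample_stream(M, 5000, 0.2, rng), d, k,
                       step_size=0.01, rng=rng)

# Subspace error: spectral norm of the difference between the projector
# onto the true top-k eigenspace and the projector onto span(Q).
top = U[:, :k]
err = np.linalg.norm(top @ top.T - Q @ Q.T, 2)
print(f"projector distance to the top-{k} subspace: {err:.3f}")
```

The algorithm holds only the $d \times k$ basis `Q` in memory and touches each sample once, which is what makes it a streaming method. The error is measured between projectors because the basis itself is determined only up to rotation within the subspace.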
