
Spectral Guarantees for Adversarial Streaming PCA

Abstract

In streaming PCA, we see a stream of vectors $x_1, \dotsc, x_n \in \mathbb{R}^d$ and want to estimate the top eigenvector of their covariance matrix. This is easier if the spectral ratio $R = \lambda_1 / \lambda_2$ is large. We ask: how large does $R$ need to be to solve streaming PCA in $\widetilde{O}(d)$ space? Existing algorithms require $R = \widetilde{\Omega}(d)$. We show: (1) For all mergeable summaries, $R = \widetilde{\Omega}(\sqrt{d})$ is necessary. (2) In the insertion-only model, a variant of Oja's algorithm gets $o(1)$ error for $R = O(\log n \log d)$. (3) No algorithm with $o(d^2)$ space gets $o(1)$ error for $R = O(1)$. Our analysis is the first application of Oja's algorithm to adversarial streams. It is also the first algorithm for adversarial streaming PCA that is designed for a spectral, rather than Frobenius, bound on the tail; and the bound it needs is exponentially better than is possible by adapting a Frobenius guarantee.
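
For readers unfamiliar with Oja's algorithm, the following is a minimal sketch of the classical update, not the variant analyzed in the paper. It keeps a single unit vector $v$ in $O(d)$ space and, for each arriving $x_t$, sets $v \leftarrow v + \eta \langle x_t, v \rangle x_t$ and renormalizes. The function names, the fixed learning rate `eta`, and the synthetic random stream are assumptions made for illustration only; the stream here is random rather than adversarial, and its spectral ratio is roughly $R = 9$.

```python
import math
import random


def oja_top_eigenvector(stream, d, eta, seed=0):
    """Classical Oja update: one pass over the stream, O(d) memory.

    `eta` is a fixed learning rate (an illustrative choice, not taken
    from the paper).
    """
    rng = random.Random(seed)
    # Start from a random unit vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]

    for x in stream:
        # v <- v + eta * <x, v> * x, then project back to the unit sphere.
        dot = sum(xi * vi for xi, vi in zip(x, v))
        v = [vi + eta * dot * xi for vi, xi in zip(v, x)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    return v


def make_stream(n, d, top_std, seed=1):
    """Hypothetical test stream: independent Gaussian coordinates, with
    coordinate 0 scaled by `top_std` so the top eigenvector is e_1."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x[0] *= top_std
        yield x


# Variances are 9 along e_1 and 1 elsewhere, so lambda_1 / lambda_2 is about 9.
d = 5
v = oja_top_eigenvector(make_stream(n=2000, d=d, top_std=3.0), d=d, eta=0.01)
alignment = abs(v[0])  # |<v, e_1>|; close to 1 means a good estimate
```

The estimate is judged by its alignment with the true top eigenvector, here `abs(v[0])`, since the sign of an eigenvector is arbitrary. The stream is consumed as a generator, so no vector other than `v` is ever stored.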
