
Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation

Abstract

We provide faster algorithms and improved sample complexities for approximating the top eigenvector of a matrix.

Offline Setting: Given an $n \times d$ matrix $A$, we show how to compute an $\epsilon$-approximate top eigenvector in time $\tilde O\left(\left[\mathrm{nnz}(A) + \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}\right] \cdot \log 1/\epsilon\right)$ and $\tilde O\left(\left[\frac{\mathrm{nnz}(A)^{3/4} (d \cdot \mathrm{sr}(A))^{1/4}}{\sqrt{\mathrm{gap}}}\right] \cdot \log 1/\epsilon\right)$. Here $\mathrm{nnz}(A)$ is the number of nonzeros in $A$, $\mathrm{sr}(A)$ is the stable rank, and $\mathrm{gap}$ is the multiplicative eigenvalue gap. By separating the $\mathrm{gap}$ dependence from $\mathrm{nnz}(A)$, we improve on the classic power and Lanczos methods. We also improve prior work using fast subspace embeddings and stochastic optimization, giving significantly better dependencies on $\mathrm{sr}(A)$ and $\epsilon$. Our second running time improves this further when $\mathrm{nnz}(A) \le \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}$.

Online Setting: Given a distribution $D$ with covariance matrix $\Sigma$ and a vector $x_0$ which is an $O(\mathrm{gap})$-approximate top eigenvector for $\Sigma$, we show how to refine it to an $\epsilon$-approximation using $\tilde O\left(\frac{v(D)}{\mathrm{gap}^2} + \frac{v(D)}{\mathrm{gap} \cdot \epsilon}\right)$ samples from $D$. Here $v(D)$ is a natural variance measure. Combining our algorithm with previous work to initialize $x_0$, we obtain a number of improved sample complexity and runtime results. For general distributions, we achieve asymptotically optimal accuracy as a function of sample size as the number of samples grows large.

Our results center around a robust analysis of the classic method of shift-and-invert preconditioning, which reduces eigenvector computation to approximately solving a sequence of linear systems. We then apply fast SVRG-based approximate system solvers to achieve our claims. We believe our results suggest the general effectiveness of shift-and-invert-based approaches and imply that further computational gains may be reaped in practice.
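To make the reduction concrete, below is a minimal Python sketch of shift-and-invert power iteration on $B = A^\top A$: given a shift slightly above the top eigenvalue, power iteration on $(\hat\lambda I - B)^{-1}$ converges in few iterations because the preconditioned matrix has a constant eigenvalue gap. The function name and parameters are illustrative assumptions, and the dense direct solve stands in for the paper's fast SVRG-based approximate solver; this is not the authors' implementation.

```python
import numpy as np

def shift_invert_power(A, shift, x0, num_iters=20):
    """Sketch: shift-and-invert power iteration for the top eigenvector of A^T A.

    Assumes `shift` is a slight over-estimate of the top eigenvalue of
    B = A^T A (e.g. roughly (1 + gap) * lambda_1), so that
    M = shift * I - B is positive definite and its inverse has a large
    (constant) multiplicative eigenvalue gap.
    """
    d = A.shape[1]
    B = A.T @ A                   # d x d matrix whose top eigenvector we seek
    M = shift * np.eye(d) - B     # shifted system; PD when shift > lambda_1(B)
    x = x0 / np.linalg.norm(x0)
    for _ in range(num_iters):
        # Each iteration is one linear system solve in M. The paper's point is
        # that a crude *approximate* solve (e.g. a few passes of SVRG) suffices
        # here; np.linalg.solve is just a placeholder exact solver.
        x = np.linalg.solve(M, x)
        x = x / np.linalg.norm(x)
    return x
```

In the actual algorithm the shift itself must be estimated rather than given; the "robust" analysis referred to in the title is what tolerates both this estimation and the inexact inner solves while preserving the overall $\log 1/\epsilon$ convergence.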
