Tight Query Complexity Lower Bounds for PCA via Finite Sample Deformed Wigner Law

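We prove a \emph{query complexity} lower bound for approximating the top $r$-dimensional eigenspace of a matrix. We consider an oracle model where, given a symmetric matrix $\mathbf{M} \in \mathbb{R}^{d \times d}$, an algorithm $\mathsf{Alg}$ is allowed to make $\mathsf{T}$ exact queries of the form $\mathsf{w}^{(i)} = \mathbf{M}\mathsf{v}^{(i)}$ for $i$ in $\{1,\dots,\mathsf{T}\}$, where $\mathsf{v}^{(i)}$ is drawn from a distribution which depends arbitrarily on the past queries and measurements $\{\mathsf{v}^{(j)}, \mathsf{w}^{(j)}\}_{1 \le j \le i-1}$. We show that for every $\mathtt{gap} \in (0,1/2]$, there exists a distribution over matrices $\mathbf{M}$ for which 1) $\mathrm{gap}_r(\mathbf{M}) = \Omega(\mathtt{gap})$ (where $\mathrm{gap}_r(\mathbf{M})$ is the normalized gap between the $r$-th and $(r+1)$-st largest-magnitude eigenvalues of $\mathbf{M}$), and 2) any algorithm $\mathsf{Alg}$ which takes fewer than $\mathrm{const} \times \frac{r \log d}{\sqrt{\mathtt{gap}}}$ queries fails (with overwhelming probability) to identify a matrix $\widehat{\mathsf{V}} \in \mathbb{R}^{d \times r}$ with orthonormal columns for which $\langle \widehat{\mathsf{V}}, \mathbf{M}\widehat{\mathsf{V}} \rangle \ge (1 - \mathrm{const} \times \mathtt{gap}) \sum_{i=1}^{r} \lambda_i(\mathbf{M})$. Our bound requires only that $d$ is a small polynomial in $1/\mathtt{gap}$ and $r$, and matches the upper bounds of Musco and Musco '15. Moreover, it establishes a strict separation between convex optimization and \emph{randomized}, "strict-saddle" non-convex optimization, of which PCA is a canonical example: in the former, first-order methods can have dimension-free iteration complexity, whereas in PCA, the iteration complexity of gradient-based methods must necessarily grow with the dimension.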
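The oracle model can be made concrete with a small sketch. The code below is illustrative only and is not the paper's lower-bound construction: the class `MatVecOracle`, the helper names, the toy diagonal matrix, and the parameter values are all assumptions introduced here. It wraps a symmetric matrix so that the only access is through exact matrix-vector products, counts those queries, and runs a standard block power method (swapped in as a simple adaptive algorithm, not the method of Musco and Musco) whose later queries depend on earlier responses. The output is then scored by the objective $\langle V, MV \rangle$ relative to the sum of the top $r$ eigenvalues.

```python
import random


class MatVecOracle:
    """Hypothetical oracle: hides a symmetric matrix M, answers w = M v exactly."""

    def __init__(self, M):
        self.M = M
        self.num_queries = 0

    def query(self, v):
        self.num_queries += 1
        return [sum(m * x for m, x in zip(row, v)) for row in self.M]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def orthonormalize(vectors):
    """Gram-Schmidt with two passes per vector for numerical stability."""
    basis = []
    for v in vectors:
        w = list(v)
        for _ in range(2):
            for b in basis:
                c = dot(w, b)
                w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis


def block_power_method(oracle, d, r, num_rounds, seed=0):
    """Adaptive algorithm: each round's r queries depend on past responses."""
    rng = random.Random(seed)
    V = orthonormalize([[rng.gauss(0, 1) for _ in range(d)] for _ in range(r)])
    for _ in range(num_rounds):
        W = [oracle.query(v) for v in V]  # r oracle queries per round
        V = orthonormalize(W)
    return V


def objective(M, V):
    """<V, M V> = sum_i v_i^T M v_i; evaluation only, not counted as queries."""
    return sum(dot(v, [dot(row, v) for row in M]) for v in V)


# Toy instance (an assumption): top-r eigenvalues 1, all others 1 - gap.
d, r, gap = 30, 2, 0.2
eigs = [1.0] * r + [1.0 - gap] * (d - r)
M = [[eigs[i] if i == j else 0.0 for j in range(d)] for i in range(d)]

oracle = MatVecOracle(M)
V_hat = block_power_method(oracle, d, r, num_rounds=40)
ratio = objective(M, V_hat) / sum(eigs[:r])
print(oracle.num_queries, round(ratio, 4))
```

In this toy run the algorithm spends `r * num_rounds` queries. The lower bound in the abstract concerns how small that count can be made, over all adaptive randomized algorithms, before the ratio can no longer be pushed close to 1 on hard instances.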