
The local convexity of solving systems of quadratic equations

Abstract

This paper considers the recovery of a rank-$r$ positive semidefinite matrix $XX^T \in \mathbb{R}^{n\times n}$ from $m$ scalar measurements of the form $y_i := a_i^T X X^T a_i$ (i.e., quadratic measurements of $X$). Such problems arise in a variety of applications, including covariance sketching of high-dimensional data streams, quadratic regression, and quantum state tomography, among others. A natural approach to this problem is to minimize the loss function $f(U) = \sum_i (y_i - a_i^T U U^T a_i)^2$, which has an entire manifold of solutions given by $\{XO\}_{O\in\mathcal{O}_r}$, where $\mathcal{O}_r$ is the group of $r\times r$ orthogonal matrices; this function is \emph{non-convex} in the $n\times r$ matrix $U$, but methods like gradient descent are simple and easy to implement (as compared to semidefinite relaxation approaches). In this paper we show that once we have $m \geq C\,nr \log^2(n)$ samples from isotropic Gaussian $a_i$, with high probability \emph{(a)} this function admits a dimension-independent region of \emph{local strong convexity} on lines perpendicular to the solution manifold, and \emph{(b)} with an additional polynomial factor of $r$ samples, a simple spectral initialization will land within the region of convexity with high probability. Together, this implies that gradient descent with such initialization (but no re-sampling) converges linearly to the correct $X$, up to an orthogonal transformation. We believe that this general technique (local convexity reachable by spectral initialization) should prove applicable to a broader class of non-convex optimization problems.
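
As a rough, self-contained illustration of the pipeline the abstract describes (spectral initialization followed by plain gradient descent on $f(U)$), here is a minimal NumPy sketch. It is not the authors' implementation: the trace-based debiasing of the spectral estimate, the step size, and the iteration count are heuristic choices made only for this example.

```python
import numpy as np

def simulate(n=50, r=3, m=4000, seed=0):
    """Draw a rank-r ground truth X and quadratic measurements y_i = a_i^T X X^T a_i."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, r))
    A = rng.standard_normal((m, n))                      # rows a_i are isotropic Gaussian
    y = np.einsum('ij,jk,ik->i', A, X @ X.T, A)          # y_i = a_i^T X X^T a_i
    return X, A, y

def spectral_init(A, y, r):
    """Top-r eigenpairs of M = (1/m) sum_i y_i a_i a_i^T, debiased using mean(y) ~ tr(XX^T),
    since for Gaussian a_i one has E[M] = 2 XX^T + tr(XX^T) I (a standard heuristic)."""
    m, n = A.shape
    M = (A * y[:, None]).T @ A / m
    Sigma_hat = (M - y.mean() * np.eye(n)) / 2.0
    vals, vecs = np.linalg.eigh(Sigma_hat)
    top = np.argsort(vals)[::-1][:r]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def gradient_descent(A, y, U0, iters=1000):
    """Plain gradient descent on f(U) = sum_i (y_i - a_i^T U U^T a_i)^2.
    The step size below is a conservative heuristic, not a constant from the paper."""
    U = U0.copy()
    m = len(y)
    step = 0.05 / (m * y.mean())
    for _ in range(iters):
        res = np.einsum('ij,jk,ik->i', A, U @ U.T, A) - y    # residuals a_i^T U U^T a_i - y_i
        grad = 4.0 * (A * res[:, None]).T @ A @ U            # grad f(U) = 4 sum_i res_i a_i a_i^T U
        U -= step * grad
    return U

if __name__ == "__main__":
    X, A, y = simulate()
    U = gradient_descent(A, y, spectral_init(A, y, r=3))
    # U is recovered only up to an r x r orthogonal transformation, so compare U U^T with X X^T.
    print("relative error:", np.linalg.norm(U @ U.T - X @ X.T) / np.linalg.norm(X @ X.T))
```

Because any solution is unique only up to right-multiplication by an element of $\mathcal{O}_r$, the sketch measures success by the relative error between $UU^T$ and $XX^T$ rather than between $U$ and $X$.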
