On the Randomized Complexity of Minimizing a Convex Quadratic Function

24 July 2018

Max Simchowitz

Abstract

Minimizing a convex, quadratic objective of the form $f_{\mathbf{A},\mathbf{b}}(x) := \frac{1}{2}x^\top \mathbf{A} x - \langle \mathbf{b}, x \rangle$ for $\mathbf{A} \succ 0$ is a fundamental problem in machine learning and optimization. In this work, we prove gradient-query complexity lower bounds for minimizing convex quadratic functions which apply to both deterministic and \emph{randomized} algorithms. Specifically, for $\kappa > 1$ , we exhibit a distribution over $(\mathbf{A},\mathbf{b})$ with condition number $\mathrm{cond}(\mathbf{A}) \le \kappa$ , such that any \emph{randomized} algorithm requires $\Omega(\sqrt{\kappa})$ gradient queries to find a solution $\hat x$ for which $\|\hat x - \mathbf x_\star\| \le \epsilon_0\|\mathbf{x}_{\star}\|$ , where $\mathbf x_{\star} = \mathbf{A}^{-1}\mathbf{b}$ is the optimal solution, and $\epsilon_0$ a small constant. Setting $\kappa =1/\epsilon$ , this lower bound implies the minimax rate of $T = \Omega(\lambda_1(\mathbf{A})\|\mathbf x_\star\|^2/\sqrt{\epsilon})$ queries required to minimize an arbitrary convex quadratic function up to error $f(\hat{x}) - f(\mathbf x_\star) \le \epsilon$ . Our lower bound holds for a distribution derived from classical ensembles in random matrix theory, and relies on a careful reduction from adaptively estimating a planted vector $\mathbf u$ in a deformed Wigner model. A key step in deriving sharp lower bounds is demonstrating that the optimization error $\mathbf x_\star - \hat x$ cannot align too closely with $\mathbf{u}$ . To this end, we prove an upper bound on the cosine between $\mathbf x_\star - \hat x$ and $\mathbf u$ in terms of the MMSE of estimating the plant $\mathbf u$ in a deformed Wigner model. We then bound the MMSE by carefully modifying a result due to Lelarge and Miolane 2016, which rigorously establishes a general replica-symmetric formula for planted matrix models.

View on arXiv

Comments on this paper