
Memory-Constrained Algorithms for Convex Optimization via Recursive Cutting-Planes

Abstract

We propose a family of recursive cutting-plane algorithms to solve feasibility problems with constrained memory, which can also be used for first-order convex optimization. Precisely, in order to find a point within a ball of radius $\epsilon$ with a separation oracle in dimension $d$ (or to minimize $1$-Lipschitz convex functions to accuracy $\epsilon$ over the unit ball), our algorithms use $\mathcal{O}(\frac{d^2}{p}\ln \frac{1}{\epsilon})$ bits of memory and make $\mathcal{O}((C\frac{d}{p}\ln \frac{1}{\epsilon})^p)$ oracle calls, for some universal constant $C \geq 1$. The family is parametrized by $p \in [d]$ and provides an oracle-complexity/memory trade-off in the sub-polynomial regime $\ln\frac{1}{\epsilon} \gg \ln d$. Several prior works gave lower-bound trade-offs (impossibility results); we make explicit their dependence on $\ln\frac{1}{\epsilon}$, showing that these lower bounds also hold in any sub-polynomial regime. To the best of our knowledge, this is the first class of algorithms that provides a positive trade-off between gradient descent and cutting-plane methods in any regime with $\epsilon \leq 1/\sqrt{d}$. The algorithms divide the $d$ variables into $p$ blocks and optimize over the blocks sequentially, with approximate separation vectors constructed using a variant of Vaidya's method. In the regime $\epsilon \leq d^{-\Omega(d)}$, our algorithm with $p = d$ achieves the information-theoretically optimal memory usage and improves upon the oracle complexity of gradient descent.
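As an illustrative sanity check (a direct substitution of $p=1$ and $p=d$ into the bounds stated above, not an additional result of the paper), the two endpoints of the trade-off read:
\[
p = 1:\quad \mathcal{O}\!\left(d^2 \ln\tfrac{1}{\epsilon}\right) \text{ bits of memory},\qquad \mathcal{O}\!\left(C\, d \ln\tfrac{1}{\epsilon}\right) \text{ oracle calls},
\]
\[
p = d:\quad \mathcal{O}\!\left(d \ln\tfrac{1}{\epsilon}\right) \text{ bits of memory},\qquad \mathcal{O}\!\left(\left(C \ln\tfrac{1}{\epsilon}\right)^{d}\right) \text{ oracle calls}.
\]
The $p=1$ endpoint matches the memory and oracle-complexity profile of standard cutting-plane methods, while the $p=d$ endpoint is the memory-optimal regime referenced above (for $\epsilon \leq d^{-\Omega(d)}$).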
