We show that any randomized first-order algorithm which minimizes a $d$-dimensional, $1$-Lipschitz convex function over the unit ball must either use $\Omega(d^{2-\delta})$ bits of memory or make $\Omega(d^{1+\delta/6-o(1)})$ queries, for any constant $\delta \in (0,1)$ and when the precision $\epsilon$ is quasipolynomially small in $d$. Our result implies that cutting plane methods, which use $\tilde{O}(d^2)$ bits of memory and $\tilde{O}(d)$ queries, are Pareto-optimal among randomized first-order algorithms, and quadratic memory is required to achieve optimal query complexity for convex optimization.
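To make the memory side of the tradeoff concrete, the following is a minimal sketch of the classical central-cut ellipsoid method, one of the simplest cutting-plane-style schemes, written for a first-order oracle over the unit ball. It is an illustration only, not the construction or the algorithms analyzed in the paper: the ellipsoid method needs on the order of d^2 log(1/eps) queries in dimension d, which is worse than the near-linear query count of the cutting plane methods discussed above. What it shares with them is the state it keeps between queries: a full d-by-d matrix, so memory grows quadratically in d. The names `ellipsoid_minimize`, `target`, `f`, and `subgrad` are hypothetical and chosen for this example.

```python
import math


def ellipsoid_minimize(f, subgrad, d, iters):
    """Central-cut ellipsoid method over the unit ball (assumes d >= 2).

    State kept between oracle queries: the center x (d numbers) and the
    shape matrix P (d * d numbers), hence memory quadratic in d.
    """
    x = [0.0] * d
    P = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    best_x, best_val = list(x), f(x)
    for _ in range(iters):
        if math.sqrt(sum(v * v for v in x)) > 1.0:
            # Center left the unit ball: cut with the constraint's gradient.
            g = list(x)
        else:
            val = f(x)
            if val < best_val:
                best_x, best_val = list(x), val
            g = subgrad(x)  # one first-order oracle query
        Pg = [sum(P[i][j] * g[j] for j in range(d)) for i in range(d)]
        gPg = sum(g[i] * Pg[i] for i in range(d))
        if gPg <= 0.0:
            break  # zero subgradient (or numerical collapse): stop
        s = math.sqrt(gPg)
        gt = [v / s for v in Pg]
        # Standard central-cut update of the center and the shape matrix.
        x = [x[i] - gt[i] / (d + 1) for i in range(d)]
        scale = d * d / (d * d - 1.0)
        P = [[scale * (P[i][j] - 2.0 / (d + 1) * gt[i] * gt[j])
              for j in range(d)] for i in range(d)]
    return best_x, best_val


# Hypothetical test function: distance to a point inside the unit ball,
# which is convex and 1-Lipschitz with minimum value 0.
target = [0.3, -0.2, 0.1]


def f(x):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, target)))


def subgrad(x):
    n = f(x)
    if n == 0.0:
        return [0.0] * len(x)
    return [(a - b) / n for a, b in zip(x, target)]


best_x, best_val = ellipsoid_minimize(f, subgrad, d=3, iters=300)
```

The matrix `P` is what a memory-constrained algorithm cannot afford: dropping it (as subgradient descent does, keeping only the d-dimensional iterate) saves memory at the cost of many more queries, which is the tension the lower bound quantifies.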