Gradient Descent is Pareto-Optimal in the Oracle Complexity and Memory Tradeoff for Feasibility Problems

In this paper we provide oracle complexity lower bounds for finding a point in a given set using a memory-constrained algorithm that has access to a separation oracle. We assume that the set is contained within the unit $d$-dimensional ball and contains a ball of known radius $\epsilon>0$. This setup is commonly referred to as the feasibility problem. We show that to solve feasibility problems with accuracy $\epsilon \geq e^{-d^{o(1)}}$, any deterministic algorithm either uses $d^{1+\delta}$ bits of memory or must make at least $1/(d^{0.01\delta}\epsilon^{2\frac{1-\delta}{1+1.01\delta}-o(1)})$ oracle queries, for any $\delta\in[0,1]$. Additionally, we show that randomized algorithms either use $d^{1+\delta}$ memory or make at least $1/(d^{2\delta}\epsilon^{2(1-4\delta)-o(1)})$ queries for any $\delta\in[0,\frac{1}{4}]$. Because gradient descent only uses linear memory $\mathcal{O}(d\ln 1/\epsilon)$ but makes $\Omega(1/\epsilon^2)$ queries, our results imply that it is Pareto-optimal in the oracle complexity/memory tradeoff. Further, our results show that the oracle complexity for deterministic algorithms is always polynomial in $1/\epsilon$ if the algorithm has less than quadratic memory in $d$. This reveals a sharp phase transition since with quadratic memory, cutting plane methods only require $\mathcal{O}(d\ln 1/\epsilon)$ queries.
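To make the setup concrete, here is a minimal sketch of the gradient-descent-style baseline the abstract refers to, written against a separation oracle. It is an illustration under stated assumptions, not the paper's construction. The names `ball_separation_oracle` and `gradient_descent_feasibility` are hypothetical, the feasible set is taken to be a single Euclidean ball, and the step size is assumed equal to the known radius $\epsilon$, with $d$ the dimension. With that step size, each infeasible query reduces the squared distance to the center of the contained ball by at least $\epsilon^2$, so the number of queries is on the order of $1/\epsilon^2$ while only the current iterate is stored.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]
# An oracle returns None if x is feasible; otherwise a unit vector g such that
# g . (y - x) <= 0 for every feasible point y (a separating direction).
Oracle = Callable[[Vector], Optional[Vector]]


def ball_separation_oracle(center: Vector, radius: float) -> Oracle:
    """Toy oracle (assumption): the feasible set is the ball B(center, radius)."""
    def oracle(x: Vector) -> Optional[Vector]:
        diff = [xi - ci for xi, ci in zip(x, center)]
        dist = math.sqrt(sum(v * v for v in diff))
        if dist <= radius:
            return None  # x is feasible
        return [v / dist for v in diff]  # unit vector pointing away from the set
    return oracle


def gradient_descent_feasibility(oracle: Oracle, d: int, eps: float) -> Tuple[Vector, int]:
    """Search for a feasible point, storing only the current iterate (d numbers).

    Assumes the feasible set lies in the unit ball and contains a ball of radius eps.
    Returns the feasible point found and the number of oracle queries made.
    """
    x = [0.0] * d  # start at the center of the unit ball
    max_queries = math.ceil(1.0 / eps ** 2) + 1  # query budget implied by the eps^2 decrease
    for queries in range(1, max_queries + 1):
        g = oracle(x)
        if g is None:
            return x, queries
        # Step of length eps against the separating direction.
        x = [xi - eps * gi for xi, gi in zip(x, g)]
    raise RuntimeError("query budget exceeded; the assumptions on the set may not hold")


if __name__ == "__main__":
    oracle = ball_separation_oracle([0.5, 0.0, 0.0, 0.0, 0.0], 0.1)
    point, num_queries = gradient_descent_feasibility(oracle, d=5, eps=0.1)
    print(point, num_queries)
```

The sketch keeps one vector of $d$ floats between queries, which is the linear-memory regime discussed above. A cutting plane method would instead maintain a description of the remaining search region, which is where the quadratic memory comes from.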