The Reflectron: Exploiting geometry for learning generalized linear models

Abstract

We present the Reflectron, a family of pseudogradient methods for learning generalized linear models inspired by mirror descent. Despite nonconvexity of the underlying optimization problem, we prove that the Reflectron is both statistically and computationally efficient. By analogy to standard mirror descent, we show that the methods can be tailored to the problem geometry through choice of a potential function that defines the optimization geometry. We provide guarantees in both the stochastic and full-batch settings, and our analysis recovers gradient descent and the GLM-tron of Kakade et al. (2011) as special cases. Via a natural continuous-time limit, we provide simple and intuitive derivations of the statistical, convergence, and implicit bias properties of the algorithms. We subsequently discretize the flow to arrive at an iteration with matching guarantees. Experimentally, the extra flexibility afforded by the Reflectron allows it to outperform the GLM-tron on sparse vector and low-rank matrix recovery problems.
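To illustrate the idea described in the abstract, the following is a minimal, hypothetical sketch of a mirror-descent-style pseudogradient iteration for a generalized linear model y ≈ u(⟨w, x⟩). The function names, step size, and choice of potential are illustrative assumptions, not the paper's exact algorithm; with the squared-norm potential the mirror map is the identity and the update reduces to a GLM-tron-style additive step.

```python
import numpy as np

def reflectron_sketch(X, y, link, grad_potential, grad_potential_inv,
                      step_size=0.1, n_iters=200):
    """Full-batch pseudogradient iteration performed in dual (mirror) coordinates.

    Hypothetical sketch: `grad_potential` is the gradient of the potential psi
    defining the geometry, and `grad_potential_inv` maps dual iterates back to
    primal ones. The pseudogradient uses the raw residual u(<w, x>) - y rather
    than the gradient of a loss, as in GLM-tron-style methods.
    """
    n, d = X.shape
    w = np.zeros(d)                       # primal iterate
    theta = grad_potential(w)             # dual iterate: theta = grad psi(w)
    for _ in range(n_iters):
        residual = link(X @ w) - y
        pseudo_grad = X.T @ residual / n
        theta = theta - step_size * pseudo_grad
        w = grad_potential_inv(theta)     # map back: w = (grad psi)^{-1}(theta)
    return w

# Example usage with the squared-norm potential (identity mirror map),
# recovering an additive, GLM-tron-like update for a sigmoid link.
identity = lambda v: v
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = sigmoid(X @ w_true)
w_hat = reflectron_sketch(X, y, sigmoid, identity, identity)
```

A sparsity-inducing potential (for example, a negative-entropy-type potential) would change the mirror map and bias the iterates toward sparse solutions, which is the kind of geometric flexibility the abstract refers to.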
