IQN: An Incremental Quasi-Newton Method with Local Superlinear Convergence Rate
This paper studies the problem of minimizing a global objective function that can be written as the average of a set of smooth and strongly convex functions. Quasi-Newton methods, which build on the idea of approximating the Newton step using only first-order information of the objective function, reduce the computational complexity of Newton's method by avoiding the computation of the Hessian and its inverse at each iteration, while still converging at a superlinear rate to the optimal argument. However, quasi-Newton methods are impractical for solving the finite sum minimization problem since they operate on the information of all functions at each iteration. This issue has been addressed by incremental quasi-Newton methods, which use the information of only a subset of functions at each iteration. Although incremental quasi-Newton methods reduce the computational complexity of traditional quasi-Newton methods significantly, they fail to converge at a superlinear rate. In this paper, we propose the IQN method as the first incremental quasi-Newton method with a local superlinear convergence rate. In IQN, we compute and update the information of only a single function at each iteration and use the gradient information to approximate the Newton direction without a computationally expensive inversion. IQN differs from state-of-the-art incremental quasi-Newton methods in three respects: first, the use of aggregated information of variables, gradients, and quasi-Newton Hessian approximations; second, the approximation of each individual function by its Taylor expansion, in which the linear and quadratic terms are evaluated with respect to the same iterate; and third, the use of a cyclic scheme to update the functions in lieu of a random selection routine. We use these fundamental properties of IQN to establish its local superlinear convergence rate.
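To make the three ingredients concrete, below is a minimal, hedged sketch of an IQN-style loop on a toy quadratic finite sum. It maintains per-function iterates, gradients, and BFGS Hessian approximations, updates one function at a time in a cyclic order, and forms the new iterate from the aggregated quantities. The function and variable names (`A_list`, `b_list`, `bfgs_update`, `iqn_sketch`) are illustrative, not taken from the paper; in particular, the paper avoids the explicit linear solve below by updating the aggregate and its inverse incrementally with rank-two corrections, which this sketch omits for brevity.

```python
import numpy as np

def bfgs_update(B, s, y):
    """Standard BFGS update of a Hessian approximation B from step s and gradient difference y."""
    Bs = B @ s
    return B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)

def iqn_sketch(A_list, b_list, x0, n_passes=50):
    """Illustrative IQN-style loop for f(x) = (1/n) * sum_i [0.5 x'A_i x + b_i'x]."""
    n, d = len(A_list), x0.size
    z = [x0.copy() for _ in range(n)]                      # per-function iterate copies z_i
    grads = [A @ x0 + b for A, b in zip(A_list, b_list)]   # grad f_i evaluated at z_i
    B = [np.eye(d) for _ in range(n)]                      # per-function Hessian approximations B_i
    x = x0.copy()
    for k in range(n_passes * n):
        i = k % n                                          # cyclic (not random) index selection
        # Aggregate step: x = (sum_i B_i)^{-1} (sum_i B_i z_i - sum_i grad f_i(z_i)).
        # A direct solve is used here for clarity; the paper maintains this
        # aggregate incrementally to avoid the expensive inversion.
        B_sum = sum(B)
        rhs = sum(Bj @ zj for Bj, zj in zip(B, z)) - sum(grads)
        x = np.linalg.solve(B_sum, rhs)
        # Refresh only the i-th function's information at the new iterate.
        new_grad = A_list[i] @ x + b_list[i]
        s, y = x - z[i], new_grad - grads[i]
        if s @ y > 1e-12:                                  # skip update if curvature condition fails
            B[i] = bfgs_update(B[i], s, y)
        z[i], grads[i] = x.copy(), new_grad
    return x
```

On a strongly convex quadratic sum such as this one, the sketch converges to the minimizer of the average; the per-iteration cost is dominated by touching a single function's gradient and Hessian approximation, which is the computational saving the abstract describes.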