
Coordinate-wise Armijo's condition

Abstract

Let $z=(x,y)$ be coordinates for the product space $\mathbb{R}^{m_1}\times \mathbb{R}^{m_2}$. Let $f:\mathbb{R}^{m_1}\times \mathbb{R}^{m_2}\rightarrow \mathbb{R}$ be a $C^1$ function, and $\nabla f=(\partial _xf,\partial _yf)$ its gradient. Fix $0<\alpha <1$. For a point $(x,y) \in \mathbb{R}^{m_1}\times \mathbb{R}^{m_2}$, a number $\delta >0$ satisfies Armijo's condition at $(x,y)$ if the following inequality holds: \begin{eqnarray*} f(x-\delta \partial _xf,y-\delta \partial _yf)-f(x,y)\leq -\alpha \delta (||\partial _xf||^2+||\partial _yf||^2). \end{eqnarray*} When $f(x,y)=f_1(x)+f_2(y)$ is a coordinate-wise sum map, we propose the following {\bf coordinate-wise} Armijo's condition. Fix again $0<\alpha <1$. A pair of positive numbers $\delta _1,\delta _2>0$ satisfies the coordinate-wise variant of Armijo's condition at $(x,y)$ if the following inequality holds: \begin{eqnarray*} [f_1(x-\delta _1\nabla f_1(x))+f_2(y-\delta _2\nabla f_2(y))]-[f_1(x)+f_2(y)]\leq -\alpha (\delta _1||\nabla f_1(x)||^2+\delta _2||\nabla f_2(y)||^2). \end{eqnarray*} We then extend our recent results on Backtracking Gradient Descent and some of its variants to this setting. We show by an example the advantage of using the coordinate-wise Armijo's condition over the usual Armijo's condition.
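The two conditions can be compared numerically with a minimal sketch. It is not the paper's algorithm. It assumes one simple way to obtain a valid pair $(\delta_1,\delta_2)$: backtrack on $f_1$ and $f_2$ separately, since adding the two per-block Armijo inequalities gives the coordinate-wise inequality. The helper names (`backtrack`, `coordinatewise_armijo_holds`), the halving factor `beta`, the initial step `delta0`, and the toy functions $f_1(x)=x^2$, $f_2(y)=100y^2$ are all hypothetical choices made for illustration.

```python
def sq_norm(v):
    """Squared Euclidean norm of a vector stored as a list of floats."""
    return sum(c * c for c in v)


def step(v, g, delta):
    """Return v - delta * g."""
    return [vi - delta * gi for vi, gi in zip(v, g)]


def backtrack(f, grad, v, alpha=0.5, delta0=1.0, beta=0.5, max_shrinks=60):
    """Shrink delta from delta0 by the factor beta until Armijo's condition
    f(v - delta*g) - f(v) <= -alpha * delta * ||g||^2 holds (assumed scheme)."""
    g = grad(v)
    delta = delta0
    for _ in range(max_shrinks):
        if f(step(v, g, delta)) - f(v) <= -alpha * delta * sq_norm(g):
            return delta
        delta *= beta
    return delta


def coordinatewise_armijo_holds(f1, grad_f1, f2, grad_f2, x, y, d1, d2, alpha):
    """Check the coordinate-wise Armijo inequality for the pair (d1, d2)."""
    gx, gy = grad_f1(x), grad_f2(y)
    lhs = (f1(step(x, gx, d1)) + f2(step(y, gy, d2))) - (f1(x) + f2(y))
    rhs = -alpha * (d1 * sq_norm(gx) + d2 * sq_norm(gy))
    return lhs <= rhs


# Toy coordinate-wise sum map (an assumption, not the paper's example):
# f(x, y) = x^2 + 100 * y^2, with a mild x-block and a stiff y-block.
f1 = lambda x: x[0] ** 2
grad_f1 = lambda x: [2.0 * x[0]]
f2 = lambda y: 100.0 * y[0] ** 2
grad_f2 = lambda y: [200.0 * y[0]]

x, y, alpha = [1.0], [1.0], 0.5

# Coordinate-wise: one step size per block.
delta1 = backtrack(f1, grad_f1, x, alpha)
delta2 = backtrack(f2, grad_f2, y, alpha)
pair_ok = coordinatewise_armijo_holds(f1, grad_f1, f2, grad_f2,
                                      x, y, delta1, delta2, alpha)

# Usual Armijo: a single step size for z = (x, y).
f = lambda z: f1(z[:1]) + f2(z[1:])
grad_f = lambda z: grad_f1(z[:1]) + grad_f2(z[1:])
delta = backtrack(f, grad_f, x + y, alpha)

print("coordinate-wise:", delta1, delta2, "condition holds:", pair_ok)
print("usual Armijo:   ", delta)
```

With these toy choices the single step size is limited by the stiff $y$-block, while the coordinate-wise search lets the $x$-block keep a much larger step. The gap depends on the assumed functions and backtracking parameters.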
