
Some convergent results for Backtracking Gradient Descent method on Banach spaces

Abstract

Our main result concerns the following condition:

{\bf Condition C.} Let $X$ be a Banach space. A $C^1$ function $f:X\rightarrow \mathbb{R}$ satisfies Condition C if whenever $\{x_n\}$ weakly converges to $x$ and $\lim _{n\rightarrow\infty}||\nabla f(x_n)||=0$, then $\nabla f(x)=0$.

We assume that there is given a canonical isomorphism between $X$ and its dual $X^*$, for example when $X$ is a Hilbert space.

{\bf Theorem.} Let $X$ be a reflexive, complete Banach space and $f:X\rightarrow \mathbb{R}$ be a $C^2$ function which satisfies Condition C. Moreover, we assume that for every bounded set $S\subset X$ we have $\sup _{x\in S}||\nabla ^2f(x)||<\infty$. We choose a random point $x_0\in X$ and construct by the Local Backtracking GD procedure (which depends on $3$ hyper-parameters $\alpha ,\beta ,\delta _0$, see later for details) the sequence $x_{n+1}=x_n-\delta (x_n)\nabla f(x_n)$. Then we have:

1) Every cluster point of $\{x_n\}$, in the {\bf weak} topology, is a critical point of $f$.
2) Either $\lim _{n\rightarrow\infty}f(x_n)=-\infty$ or $\lim _{n\rightarrow\infty}||x_{n+1}-x_n||=0$.
3) Here we work with the weak topology. Let $\mathcal{C}$ be the set of critical points of $f$. Assume that $\mathcal{C}$ has a bounded component $A$. Let $\mathcal{B}$ be the set of cluster points of $\{x_n\}$. If $\mathcal{B}\cap A\not= \emptyset$, then $\mathcal{B}\subset A$ and $\mathcal{B}$ is connected.
4) Assume that $X$ is separable. Then for generic choices of $\alpha ,\beta ,\delta _0$ and the initial point $x_0$, if the sequence $\{x_n\}$ converges in the {\bf weak} topology, then the limit point cannot be a saddle point.
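
The update $x_{n+1}=x_n-\delta (x_n)\nabla f(x_n)$ can be illustrated in finite dimensions. The sketch below is not the Local Backtracking GD procedure of the theorem, whose definition is deferred to the body of the paper. It is the standard Armijo backtracking rule on $\mathbb{R}^d$, under the assumption that $\alpha ,\beta \in (0,1)$ and $\delta _0>0$ play their usual roles: sufficient-decrease constant, shrink factor, and initial learning rate. The function name `backtracking_gd` and the parameters `steps`, `tol` and `max_backtracks` are hypothetical and introduced only for this illustration.

```python
import math


def backtracking_gd(f, grad, x0, alpha=0.5, beta=0.5, delta0=1.0,
                    steps=100, tol=1e-10, max_backtracks=60):
    """Standard Armijo backtracking gradient descent on R^d.

    Illustrative only: an assumed finite-dimensional stand-in, not the
    paper's Local Backtracking GD. Points are lists of floats.
    """
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        g2 = sum(gi * gi for gi in g)  # squared Euclidean norm of the gradient
        if math.sqrt(g2) < tol:
            break  # (numerically) at a critical point
        fx = f(x)
        delta = delta0
        # Shrink delta by the factor beta until Armijo's condition holds:
        #   f(x - delta * g) - f(x) <= -alpha * delta * ||g||^2
        for _ in range(max_backtracks):
            candidate = [xi - delta * gi for xi, gi in zip(x, g)]
            if f(candidate) - fx <= -alpha * delta * g2:
                break
            delta *= beta
        # Take the step with the last delta tried.
        x = [xi - delta * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    # Toy objective with a single critical point at (1, -3).
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2


def quadratic_grad(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


x_star = backtracking_gd(quadratic, quadratic_grad, [0.0, 0.0])
```

Here `x_star` approximates the unique critical point of the toy quadratic. The cap `max_backtracks` is a numerical safeguard added for this sketch and is not one of the three hyper-parameters named in the theorem.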
