Stochastic Subgradient Descent Escapes Active Strict Saddles

Abstract

In non-smooth stochastic optimization, we establish the non-convergence of stochastic subgradient descent (SGD) to the critical points recently called active strict saddles by Davis and Drusvyatskiy. Such points lie on a manifold $M$ where the function $f$ has a direction of second-order negative curvature. Off this manifold, the norm of the Clarke subdifferential of $f$ is lower-bounded. We require two conditions on $f$. The first assumption is a Verdier stratification condition, which is a refinement of the popular Whitney stratification. It allows us to establish a reinforced version of the projection formula of Bolte et al. for Whitney stratifiable functions, which is of independent interest. The second assumption, termed the angle condition, allows us to control the distance of the iterates to $M$. When $f$ is weakly convex, our assumptions are generic. Consequently, generically in the class of definable weakly convex functions, SGD converges to a local minimizer.
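As a purely illustrative sketch of the phenomenon (not taken from the paper), consider the toy function $f(x, y) = |x| + (y^2 - 1)^2 / 4$, chosen here as an assumption. It is weakly convex, the manifold $M = \{x = 0\}$ carries the non-smoothness, the origin is a critical point where $f$ restricted to $M$ has negative curvature in $y$, and the points $(0, \pm 1)$ are local minimizers. The step-size schedule, the uniform noise model, and the function names below are likewise hypothetical choices made for the demonstration.

```python
import numpy as np

def subgradient(z):
    """One Clarke subgradient of the toy f(x, y) = |x| + (y**2 - 1)**2 / 4."""
    x, y = z
    # np.sign(0.0) is 0.0, a valid element of the subdifferential [-1, 1] of |x| at 0.
    return np.array([np.sign(x), y * (y**2 - 1.0)])

def run_sgd(z0, n_steps=20000, seed=0):
    """Stochastic subgradient descent with vanishing steps and zero-mean noise.

    Assumed (hypothetical) setup: steps 0.2 / k**0.6 and noise uniform on [-1, 1]^2.
    """
    rng = np.random.default_rng(seed)
    z = np.array(z0, dtype=float)
    for k in range(1, n_steps + 1):
        step = 0.2 / k**0.6
        noise = rng.uniform(-1.0, 1.0, size=2)
        z = z - step * (subgradient(z) + noise)
    return z

if __name__ == "__main__":
    # Start exactly at the saddle (0, 0); the noise should push the iterates
    # along M toward one of the two local minimizers (0, 1) or (0, -1).
    for seed in range(3):
        x, y = run_sgd((0.0, 0.0), seed=seed)
        print(f"seed={seed}: x={x:+.4f}, y={y:+.4f}")
```

In this sketch the sign term keeps the first coordinate close to $M$, while the noise component along $M$ moves the second coordinate away from the saddle. This only mirrors the abstract's statement informally and verifies none of its assumptions.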
