v1v2 (latest)

Spurious Stationarity and Hardness Results for Bregman Proximal-Type Algorithms

11 April 2024

Main:18 Pages

2 Figures

Bibliography:4 Pages

Appendix:7 Pages

Abstract

Bregman proximal-type algorithms (BPs), such as mirror descent, have become popular tools in machine learning and data science for exploiting problem structures through non-Euclidean geometries. In this paper, we show that BPs can get trapped near a class of non-stationary points, which we term spurious stationary points. Such stagnation can persist for any finite number of iterations if the gradient of the Bregman kernel is not Lipschitz continuous, even in convex problems. The root cause lies in a fundamental contrast in descent behavior between Euclidean and Bregman geometries: While Euclidean gradient descent ensures sufficient decrease near any non-stationary point, BPs may exhibit arbitrarily slow decrease around spurious stationary points. As a result, commonly used Bregman-based stationarity measure, such as relative change in terms of Bregman divergence, can vanish near spurious stationary points. This may misleadingly suggest convergence, even when the iterates remain far from any true stationary point. Our analysis further reveals that spurious stationary points are not pathological, but rather occur generically in a broad class of nonconvex problems with polyhedral constraints. Taken together, our findings reveal a serious blind spot in Bregman-based optimization methods and calls for new theoretical tools and algorithmic safeguards to ensure reliable convergence.

View on arXiv

Comments on this paper