Stochastic Gradient Descent in Non-Convex Problems: Asymptotic Convergence with Relaxed Step-Size via Stopping Time Methods

Abstract

Stochastic Gradient Descent (SGD) is widely used in machine learning research. Previous convergence analyses of SGD under the vanishing step-size setting typically require the Robbins-Monro conditions. In practice, however, a wider variety of step-size schemes is employed, and existing convergence results for them remain limited and often rely on strong assumptions. This paper bridges this gap by introducing a novel analytical framework based on a stopping-time method, enabling asymptotic convergence analysis of SGD under more relaxed step-size conditions and weaker assumptions. In the non-convex setting, we prove the almost sure convergence of the SGD iterates for step-sizes $\{\epsilon_t\}_{t \geq 1}$ satisfying $\sum_{t=1}^{+\infty} \epsilon_t = +\infty$ and $\sum_{t=1}^{+\infty} \epsilon_t^p < +\infty$ for some $p > 2$. Compared with previous studies, our analysis eliminates the global Lipschitz continuity assumption on the loss function and relaxes the boundedness requirements on the higher-order moments of the stochastic gradients. Building on the almost sure convergence result, we further establish $L_2$ convergence. These significantly relaxed assumptions make our theoretical results more general and thereby enhance their applicability in practical scenarios.
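As a concrete illustration (not drawn from the paper itself), the sketch below runs SGD on a simple non-convex objective with the step-size $\epsilon_t = c/\sqrt{t}$. This schedule satisfies the relaxed conditions above ($\sum_{t} \epsilon_t = +\infty$ and $\sum_{t} \epsilon_t^p < +\infty$ for any $p > 2$) while violating the classical Robbins-Monro requirement $\sum_{t} \epsilon_t^2 < +\infty$. The objective, noise model, and constants are illustrative assumptions, not the authors' setup.

```python
# Illustrative sketch only: SGD with the relaxed step-size eps_t = c / sqrt(t),
# which satisfies sum_t eps_t = +inf and sum_t eps_t^p < +inf for any p > 2,
# but NOT the Robbins-Monro condition sum_t eps_t^2 < +inf.
# The objective and noise model below are assumptions for demonstration purposes.
import numpy as np

def grad(x):
    # Gradient of a simple non-convex objective f(x) = x^2 + 3*sin^2(x).
    return 2.0 * x + 6.0 * np.sin(x) * np.cos(x)

rng = np.random.default_rng(0)
x = 5.0        # initial iterate
c = 0.5        # step-size constant (illustrative)
sigma = 0.1    # std. dev. of the additive gradient noise (illustrative)

for t in range(1, 100001):
    eps_t = c / np.sqrt(t)                        # relaxed step-size schedule
    g = grad(x) + sigma * rng.standard_normal()   # stochastic gradient
    x = x - eps_t * g

print(f"final iterate: {x:.4f}, final gradient norm: {abs(grad(x)):.2e}")
```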

@article{jin2025_2504.12601,
  title={Stochastic Gradient Descent in Non-Convex Problems: Asymptotic Convergence with Relaxed Step-Size via Stopping Time Methods},
  author={Ruinan Jin and Difei Cheng and Hong Qiao and Xin Shi and Shaodong Liu and Bo Zhang},
  journal={arXiv preprint arXiv:2504.12601},
  year={2025}
}