Beyond Uniform Smoothness: A Stopped Analysis of Adaptive SGD

13 February 2023

Papers citing "Beyond Uniform Smoothness: A Stopped Analysis of Adaptive SGD"


Title | Authors | Tag | Counts | Date
AdamS: Momentum Itself Can Be A Normalizer for LLM Pretraining and Post-training | Huishuai Zhang, Bohan Wang, Luoxin Chen | ODL | 156 0 0 | 22 May 2025
Understanding Gradient Orthogonalization for Deep Learning via Non-Euclidean Trust-Region Optimization | Dmitry Kovalev | - | 93 3 0 | 16 Mar 2025
An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness | Xiaochuan Gong, Jie Hao, Mingrui Liu | - | 91 2 0 | 28 Sep 2024
Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance | Qi Zhang, Yi Zhou, Shaofeng Zou | - | 82 5 0 | 01 Apr 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions | Yusu Hong, Junhong Lin | - | 76 13 0 | 06 Feb 2024
High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize | Ali Kavis, Kfir Y. Levy, Volkan Cevher | - | 35 41 0 | 06 Apr 2022
A Simple Convergence Proof of Adam and Adagrad | Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier | - | 92 150 0 | 05 Mar 2020
Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming | Saeed Ghadimi, Guanghui Lan | ODL | 79 1,538 0 | 22 Sep 2013