arXiv:2305.16284
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method
25 May 2023
Ahmed Khaled
Konstantin Mishchenko
Chi Jin
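The listed paper introduces DoWG (distance over weighted gradients), a parameter-free gradient descent method. As a rough illustration only, here is a minimal sketch of a DoWG-style update, assuming the commonly cited form from the parameter-free literature: the step size is r̄²/√v, where r̄ tracks the maximum distance travelled from the initial point and v accumulates r̄²‖g‖² terms. None of the names or defaults below are taken from this page.

```python
import numpy as np

def dowg(grad, x0, steps=100, r_eps=1e-6):
    """Sketch of a DoWG-style parameter-free gradient descent loop.

    Uses step size r_bar**2 / sqrt(v), where r_bar is the running
    maximum distance from x0 and v is a weighted sum of squared
    gradient norms. This form is assumed from the broader
    parameter-free literature, not quoted from this page.
    """
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    r_bar = r_eps  # small initial distance estimate (avoids a zero step size)
    v = 0.0        # running weighted sum of squared gradient norms
    for _ in range(steps):
        g = grad(x)
        r_bar = max(r_bar, float(np.linalg.norm(x - x0)))
        v += r_bar ** 2 * float(np.dot(g, g))
        if v == 0.0:
            break  # zero gradient so far: already stationary
        x = x - (r_bar ** 2 / np.sqrt(v)) * g
    return x
```

On a simple quadratic such as f(x) = ‖x‖² (gradient 2x), this loop drifts toward the minimizer without any tuned learning rate, which is the "parameter-free" property the title refers to.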
Papers citing "DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method" (7 of 7 papers shown)
A simple uniformly optimal method without line search for convex optimization. Tianjiao Li and Guanghui Lan (16 Oct 2023).
Adaptive Proximal Gradient Method for Convex Optimization. Yura Malitsky and Konstantin Mishchenko (04 Aug 2023).
Convergence of Adam Under Relaxed Assumptions. Haochuan Li, Alexander Rakhlin, and Ali Jadbabaie (27 Apr 2023).
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be. Frederik Kunstner, Jacques Chen, J. Lavington, and Mark W. Schmidt (27 Apr 2023).
On the Convergence of AdaGrad(Norm) on $\mathbb{R}^{d}$: Beyond Convexity, Non-Asymptotic Rate and Acceleration. Zijian Liu, Ta Duy Nguyen, Alina Ene, and Huy Le Nguyen (29 Sep 2022).
Understanding Gradient Descent on Edge of Stability in Deep Learning. Sanjeev Arora, Zhiyuan Li, and A. Panigrahi (19 May 2022).
Carbon Emissions and Large Neural Network Training. David A. Patterson, Joseph E. Gonzalez, Quoc V. Le, Chen Liang, Lluís-Miquel Munguía, D. Rothchild, David R. So, Maud Texier, and J. Dean (21 Apr 2021).