From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent

13 October 2022

Papers citing "From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent"

Title | Authors | Tags and counts | Date
Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning? | Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi, Stefanie Jegelka, Sanjiv Kumar | 67 17 0 | 10 Oct 2024
Dr. FERMI: A Stochastic Distributionally Robust Fair Empirical Risk Minimization Framework | Sina Baharlouei, Meisam Razaviyayn | FaML OOD 30 0 0 | 20 Sep 2023
Convergence of stochastic gradient descent under a local Lojasiewicz condition for deep neural networks | Jing An, Jianfeng Lu | 11 4 0 | 18 Apr 2023
Continuous vs. Discrete Optimization of Deep Neural Networks | Omer Elkabetz, Nadav Cohen | 58 44 0 | 14 Jul 2021
SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs | Satyen Kale, Ayush Sekhari, Karthik Sridharan | 173 28 0 | 11 Jul 2021
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition | Hamed Karimi, J. Nutini, Mark W. Schmidt | 119 1,190 0 | 16 Aug 2016
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights | Weijie Su, Stephen P. Boyd, Emmanuel J. Candes | 97 1,150 0 | 04 Mar 2015