2403.05293
Cited By
Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks
8 March 2024
Hristo Papazov
Scott Pesme
Nicolas Flammarion
Papers citing "Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks"
5 / 5 papers shown
Title
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
66
0
0
21 Dec 2024
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini
Pierre Ablin
David Grangier
ODL
28
8
0
05 Sep 2024
Implicit Bias of Mirror Flow on Separable Data
Scott Pesme
Radu-Alexandru Dragomir
Nicolas Flammarion
29
1
0
18 Jun 2024
Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn
Junghyun Lee
Bohan Wang
Huishuai Zhang
Chulhee Yun
13
0
0
25 Nov 2023
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
Weijie Su
Stephen P. Boyd
Emmanuel J. Candes
97
1,150
0
04 Mar 2015