ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2403.05293
  4. Cited By
Leveraging Continuous Time to Understand Momentum When Training Diagonal
  Linear Networks

Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks

8 March 2024
Hristo Papazov
Scott Pesme
Nicolas Flammarion
ArXivPDFHTML

Papers citing "Leveraging Continuous Time to Understand Momentum When Training Diagonal Linear Networks"

5 / 5 papers shown
Title
Optimization Insights into Deep Diagonal Linear Networks
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
66
0
0
21 Dec 2024
The AdEMAMix Optimizer: Better, Faster, Older
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini
Pierre Ablin
David Grangier
ODL
28
8
0
05 Sep 2024
Implicit Bias of Mirror Flow on Separable Data
Implicit Bias of Mirror Flow on Separable Data
Scott Pesme
Radu-Alexandru Dragomir
Nicolas Flammarion
27
1
0
18 Jun 2024
Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large
  Catapults
Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn
Junghyun Lee
Bohan Wang
Huishuai Zhang
Chulhee Yun
10
0
0
25 Nov 2023
A Differential Equation for Modeling Nesterov's Accelerated Gradient
  Method: Theory and Insights
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights
Weijie Su
Stephen P. Boyd
Emmanuel J. Candes
97
1,150
0
04 Mar 2015
1