ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2205.10287
  4. Cited By
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms

On the SDEs and Scaling Rules for Adaptive Gradient Algorithms

20 May 2022
Sadhika Malladi
Kaifeng Lyu
A. Panigrahi
Sanjeev Arora
ArXivPDFHTML

Papers citing "On the SDEs and Scaling Rules for Adaptive Gradient Algorithms"

5 / 5 papers shown
Title
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
24
3
0
10 Jan 2025
AdaFisher: Adaptive Second Order Optimization via Fisher Information
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
51
2
0
26 May 2024
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning
  Dynamics
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
D. Kunin
Javier Sagastuy-Breña
Surya Ganguli
Daniel L. K. Yamins
Hidenori Tanaka
97
77
0
08 Dec 2020
Quasi-hyperbolic momentum and Adam for deep learning
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
73
126
0
16 Oct 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,003
0
20 Apr 2018
1