2205.10287
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
20 May 2022
Sadhika Malladi
Kaifeng Lyu
A. Panigrahi
Sanjeev Arora
Papers citing
"On the SDEs and Scaling Rules for Adaptive Gradient Algorithms"
5 / 5 papers shown
Title
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
24
3
0
10 Jan 2025
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
51
2
0
26 May 2024
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
D. Kunin
Javier Sagastuy-Breña
Surya Ganguli
Daniel L. K. Yamins
Hidenori Tanaka
97
77
0
08 Dec 2020
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
73
126
0
16 Oct 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,003
0
20 Apr 2018