Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat, Anirbit Mukherjee, Procheta Sen, Mingfei Sun, Omar Rivasplata
arXiv: 2404.08624 · 12 April 2024 · MLT

Papers citing "Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks" (4 of 4 papers shown):
1. FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information
   Dongseong Hwang · ODL · 21 May 2024
2. On the One-sided Convergence of Adam-type Algorithms in Non-convex Non-concave Min-max Optimization
   Zehao Dou, Yuanzhi Li · 29 Sep 2021
3. High-Performance Large-Scale Image Recognition Without Normalization
   Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan · VLM · 11 Feb 2021
4. Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
   Ohad Shamir, Tong Zhang · 08 Dec 2012