ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information
arXiv:2405.12807 — 21 May 2024
Dongseong Hwang (ODL)

Papers citing "FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information"

8 / 8 papers shown
  • Fine-Tuning TransMorph with Gradient Correlation for Anatomical Alignment
    Lukas Förner, Kartikay Tehlan, Thomas Wendler (MedIm) — 31 Dec 2024
  • High-precision medical speech recognition through synthetic data and semantic correction: UNITED-MEDASR
    Sourav Banerjee, Ayushi Agarwal, Promila Ghosh — 24 Nov 2024
  • Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
    Wu Lin, Felix Dangel, Runa Eschenhagen, Juhan Bae, Richard E. Turner, Alireza Makhzani (ODL) — 05 Feb 2024
  • Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be
    Frederik Kunstner, Jacques Chen, J. Lavington, Mark W. Schmidt — 27 Apr 2023
  • MetNet: A Neural Weather Model for Precipitation Forecasting
    C. Sønderby, L. Espeholt, Jonathan Heek, Mostafa Dehghani, Avital Oliver, Tim Salimans, Shreya Agrawal, Jason Hickey, Nal Kalchbrenner (AI4Cl) — 24 Mar 2020
  • A Simple Convergence Proof of Adam and Adagrad
    Alexandre Défossez, Léon Bottou, Francis R. Bach, Nicolas Usunier — 05 Mar 2020
  • Scaling Laws for Neural Language Models
    Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei — 23 Jan 2020
  • MCMC using Hamiltonian dynamics
    Radford M. Neal — 09 Jun 2012