2401.10809
Cited By
Neglected Hessian component explains mysteries in Sharpness regularization
19 January 2024
Yann N. Dauphin
Atish Agarwala
Hossein Mobahi
FAtt
Papers citing "Neglected Hessian component explains mysteries in Sharpness regularization"
6 / 6 papers shown
Title
LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection
Xinyue Zeng
Haohui Wang
Junhong Lin
Jun Wu
Tyler Cody
Dawei Zhou
46
0
0
01 May 2025
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
29
6
0
14 Oct 2024
High dimensional analysis reveals conservative sharpening and a stochastic edge of stability
Atish Agarwala
Jeffrey Pennington
35
3
0
30 Apr 2024
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
39
1
0
29 Nov 2023
The Hessian perspective into the Nature of Convolutional Neural Networks
Sidak Pal Singh
Thomas Hofmann
Bernhard Schölkopf
25
10
0
16 May 2023
Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping
James Martens
Andy Ballard
Guillaume Desjardins
G. Swirszcz
Valentin Dalibard
Jascha Narain Sohl-Dickstein
S. Schoenholz
83
43
0
05 Oct 2021