arXiv:2206.04817
The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
10 June 2022
Vimal Thilak
Etai Littwin
Shuangfei Zhai
Omid Saremi
Roni Paiss
J. Susskind
Papers citing "The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon"
7 / 7 papers shown
Title
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov
Felix Steinbauer
Gjergji Kasneci
72
0
0
29 Apr 2025
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
Xinyu Zhou
Simin Fan
Martin Jaggi
Jie Fu
23
0
0
24 Apr 2025
Survival of the Fittest Representation: A Case Study with Modular Addition
Xiaoman Delores Ding
Zifan Carl Guo
Eric J. Michaud
Ziming Liu
Max Tegmark
34
3
0
27 May 2024
Grokking as Compression: A Nonlinear Complexity Perspective
Ziming Liu
Ziqian Zhong
Max Tegmark
30
9
0
09 Oct 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
35
80
0
25 Sep 2023
Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora
Zhiyuan Li
A. Panigrahi
MLT
75
88
0
19 May 2022
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
153
232
0
04 Mar 2020