2405.20233
Cited By
Grokfast: Accelerated Grokking by Amplifying Slow Gradients
30 May 2024
Jaerin Lee
Bong Gyun Kang
Kihoon Kim
Kyoung Mu Lee
Papers citing "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"
6 / 6 papers shown
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov, Felix Steinbauer, Gjergji Kasneci
54 · 0 · 0
29 Apr 2025

NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
Xinyu Zhou, Simin Fan, Martin Jaggi, Jie Fu
18 · 0 · 0
24 Apr 2025

WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé, Johan Ferret, Nino Vieillard, Robert Dadashi, Léonard Hussenot, Pierre-Louis Cedoz, Pier Giuseppe Sessa, Sertan Girgin, Arthur Douillard, Olivier Bachem
47 · 13 · 0
24 Jun 2024

Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition
Yufei Huang, Shengding Hu, Xu Han, Zhiyuan Liu, Maosong Sun
62 · 14 · 0
23 Feb 2024

Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu, Eric J. Michaud, Max Tegmark
54 · 76 · 0
03 Oct 2022

Multi-scale Feature Learning Dynamics: Insights for Double Descent
Mohammad Pezeshki, Amartya Mitra, Yoshua Bengio, Guillaume Lajoie
45 · 25 · 0
06 Dec 2021