2303.06173
Unifying Grokking and Double Descent
10 March 2023
Peter W. Battaglia
David Raposo
Kelsey
Papers citing "Unifying Grokking and Double Descent"
26 / 26 papers shown
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov
Felix Steinbauer
Gjergji Kasneci
51
0
0
29 Apr 2025
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
Xinyu Zhou
Simin Fan
Martin Jaggi
Jie Fu
18
0
0
24 Apr 2025
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
Zhiwei Xu
Zhiyu Ni
Yixin Wang
Wei Hu
CLL
32
0
0
17 Apr 2025
How more data can hurt: Instability and regularization in next-generation reservoir computing
Yuanzhao Zhang
Edmilson Roque dos Santos
Sean P. Cornelius
77
2
0
28 Jan 2025
Grokking at the Edge of Linear Separability
Alon Beck
Noam Levi
Yohai Bar-Sinai
24
0
0
06 Oct 2024
Approaching Deep Learning through the Spectral Dynamics of Weights
David Yunis
Kumar Kshitij Patel
Samuel Wheeler
Pedro H. P. Savarese
Gal Vardi
Karen Livescu
Michael Maire
Matthew R. Walter
34
3
0
21 Aug 2024
Emergence in non-neural models: grokking modular arithmetic via average gradient outer product
Neil Rohit Mallinar
Daniel Beaglehole
Libin Zhu
Adityanarayanan Radhakrishnan
Parthe Pandit
Misha Belkin
37
7
0
29 Jul 2024
One system for learning and remembering episodes and rules
Joshua T. S. Hewson
Sabina J. Sloman
Marina Dubova
CLL
20
0
0
08 Jul 2024
Grokking Modular Polynomials
Darshil Doshi
Tianyu He
Aritra Das
Andrey Gromov
29
4
0
05 Jun 2024
Grokfast: Accelerated Grokking by Amplifying Slow Gradients
Jaerin Lee
Bong Gyun Kang
Kihoon Kim
Kyoung Mu Lee
25
11
0
30 May 2024
Survival of the Fittest Representation: A Case Study with Modular Addition
Xiaoman Delores Ding
Zifan Carl Guo
Eric J. Michaud
Ziming Liu
Max Tegmark
29
3
0
27 May 2024
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
Boshi Wang
Xiang Yue
Yu-Chuan Su
Huan Sun
LRM
16
41
0
23 May 2024
Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition
Yufei Huang
Shengding Hu
Xu Han
Zhiyuan Liu
Maosong Sun
62
14
0
23 Feb 2024
On Catastrophic Inheritance of Large Foundation Models
Hao Chen
Bhiksha Raj
Xing Xie
Jindong Wang
AI4CE
48
12
0
02 Feb 2024
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu
Jikai Jin
Zhiyuan Li
Simon S. Du
Jason D. Lee
Wei Hu
AI4CE
22
32
0
30 Nov 2023
Understanding Grokking Through A Robustness Viewpoint
Zhiquan Tan
Weiran Huang
AAML
OOD
25
6
0
11 Nov 2023
Bridging Lottery Ticket and Grokking: Understanding Grokking from Inner Structure of Networks
Gouki Minegishi
Yusuke Iwasawa
Yutaka Matsuo
11
3
0
30 Oct 2023
Grokking in Linear Estimators -- A Solvable Model that Groks without Understanding
Noam Levi
Alon Beck
Yohai Bar-Sinai
11
16
0
25 Oct 2023
Grokking as the Transition from Lazy to Rich Training Dynamics
Tanishq Kumar
Blake Bordelon
Samuel Gershman
C. Pehlevan
20
31
0
09 Oct 2023
Grokking as Compression: A Nonlinear Complexity Perspective
Ziming Liu
Ziqian Zhong
Max Tegmark
12
9
0
09 Oct 2023
Benign Overfitting and Grokking in ReLU Networks for XOR Cluster Data
Zhiwei Xu
Yutong Wang
Spencer Frei
Gal Vardi
Wei Hu
MLT
11
23
0
04 Oct 2023
Explaining grokking through circuit efficiency
Vikrant Varma
Rohin Shah
Zachary Kenton
János Kramár
Ramana Kumar
8
47
0
05 Sep 2023
The semantic landscape paradigm for neural networks
Shreyas Gokhale
13
2
0
18 Jul 2023
A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations
Bilal Chughtai
Lawrence Chan
Neel Nanda
10
96
0
06 Feb 2023
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
30
27
0
20 Sep 2022
Multi-scale Feature Learning Dynamics: Insights for Double Descent
Mohammad Pezeshki
Amartya Mitra
Yoshua Bengio
Guillaume Lajoie
45
25
0
06 Dec 2021