ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.20233
  4. Cited By
Grokfast: Accelerated Grokking by Amplifying Slow Gradients

Grokfast: Accelerated Grokking by Amplifying Slow Gradients

30 May 2024
Jaerin Lee
Bong Gyun Kang
Kihoon Kim
Kyoung Mu Lee
ArXivPDFHTML

Papers citing "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"

6 / 6 papers shown
Title
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov
Felix Steinbauer
Gjergji Kasneci
54
0
0
29 Apr 2025
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
Xinyu Zhou
Simin Fan
Martin Jaggi
Jie Fu
18
0
0
24 Apr 2025
WARP: On the Benefits of Weight Averaged Rewarded Policies
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
47
13
0
24 Jun 2024
Unified View of Grokking, Double Descent and Emergent Abilities: A
  Perspective from Circuits Competition
Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition
Yufei Huang
Shengding Hu
Xu Han
Zhiyuan Liu
Maosong Sun
62
14
0
23 Feb 2024
Omnigrok: Grokking Beyond Algorithmic Data
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
54
76
0
03 Oct 2022
Multi-scale Feature Learning Dynamics: Insights for Double Descent
Multi-scale Feature Learning Dynamics: Insights for Double Descent
Mohammad Pezeshki
Amartya Mitra
Yoshua Bengio
Guillaume Lajoie
45
25
0
06 Dec 2021
1