ResearchTrend.AI
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

23 May 2024
Boshi Wang
Xiang Yue
Yu-Chuan Su
Huan Sun
    LRM

Papers citing "Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization"

16 / 16 papers shown
A Mathematical Philosophy of Explanations in Mechanistic Interpretability -- The Strange Science Part I.i
Kola Ayonrinde
Louis Jaburi
MILM
82
1
0
01 May 2025
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov
Felix Steinbauer
Gjergji Kasneci
69
0
0
29 Apr 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
78
4
0
31 Dec 2024
MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models
Jiachun Li
Pengfei Cao
Zhuoran Jin
Yubo Chen
Kang-Jun Liu
Jun Zhao
LRM
ELM
32
4
0
12 Oct 2024
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models
Sitao Cheng
Liangming Pan
Xunjian Yin
Xinyi Wang
William Yang Wang
KELM
37
3
0
10 Oct 2024
Language Models "Grok" to Copy
Ang Lv
Ruobing Xie
Xingwu Sun
Zhanhui Kang
Rui Yan
LLMAG
36
1
0
14 Sep 2024
Can Transformers Do Enumerative Geometry?
Baran Hashemi
Roderic G. Corominas
Alessandro Giacchetto
32
2
0
27 Aug 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
73
18
0
02 Jul 2024
Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition
Yufei Huang
Shengding Hu
Xu Han
Zhiyuan Liu
Maosong Sun
62
14
0
23 Feb 2024
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
Xinyi Wang
Alfonso Amayuelas
Kexun Zhang
Liangming Pan
Wenhu Chen
W. Wang
LRM
32
11
0
05 Feb 2024
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Michael Hanna
Ollie Liu
Alexandre Variengien
LRM
184
116
0
30 Apr 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
210
491
0
01 Nov 2022
Recitation-Augmented Language Models
Zhiqing Sun
Xuezhi Wang
Yi Tay
Yiming Yang
Denny Zhou
RALM
192
60
0
04 Oct 2022
Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu
Eric J. Michaud
Max Tegmark
54
76
0
03 Oct 2022
Training Language Models with Memory Augmentation
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
232
127
0
25 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022