Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to
the Edge of Generalization

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization

23 May 2024

Papers citing "Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization"

16 / 16 papers shown

Title
A Mathematical Philosophy of Explanations in Mechanistic Interpretability -- The Strange Science Part I.i Kola Ayonrinde Louis Jaburi MILM 79 1 0 01 May 2025
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers Roman Abramov Felix Steinbauer Gjergji Kasneci 60 0 0 29 Apr 2025
Out-of-distribution generalization via composition: a lens through induction heads in Transformers Jiajun Song Zhuoyan Xu Yiqiao Zhong 78 4 0 31 Dec 2024
MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models Jiachun Li Pengfei Cao Zhuoran Jin Yubo Chen Kang-Jun Liu Jun Zhao LRM ELM 32 4 0 12 Oct 2024
Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models Sitao Cheng Liangming Pan Xunjian Yin Xinyi Wang William Yang Wang KELM 35 3 0 10 Oct 2024
Language Models "Grok" to Copy Ang Lv Ruobing Xie Xingwu Sun Zhanhui Kang Rui Yan LLMAG 36 1 0 14 Sep 2024
Can Transformers Do Enumerative Geometry? Baran Hashemi Roderic G. Corominas Alessandro Giacchetto 32 2 0 27 Aug 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models Daking Rai Yilun Zhou Shi Feng Abulhair Saparov Ziyu Yao 73 18 0 02 Jul 2024
Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition Yufei Huang Shengding Hu Xu Han Zhiyuan Liu Maosong Sun 62 14 0 23 Feb 2024
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation Xinyi Wang Alfonso Amayuelas Kexun Zhang Liangming Pan Wenhu Chen W. Wang LRM 32 11 0 05 Feb 2024
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model Michael Hanna Ollie Liu Alexandre Variengien LRM 184 116 0 30 Apr 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small Kevin Wang Alexandre Variengien Arthur Conmy Buck Shlegeris Jacob Steinhardt 210 486 0 01 Nov 2022
Recitation-Augmented Language Models Zhiqing Sun Xuezhi Wang Yi Tay Yiming Yang Denny Zhou RALM 192 60 0 04 Oct 2022
Omnigrok: Grokking Beyond Algorithmic Data Ziming Liu Eric J. Michaud Max Tegmark 54 76 0 03 Oct 2022
Training Language Models with Memory Augmentation Zexuan Zhong Tao Lei Danqi Chen RALM 232 126 0 25 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Jason W. Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter F. Xia Ed H. Chi Quoc Le Denny Zhou LM&Ro LRM AI4CE ReLM 315 8,261 0 28 Jan 2022