Grokking ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior

26 May 2025
Florian Eichin, Yupei Du, Philipp Mondorf, Barbara Plank, Michael A. Hedderich
    FAtt

Papers citing "Grokking ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior"

16 / 16 papers shown
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens
Jiacheng Liu, Taylor Blanton, Yanai Elazar, Sewon Min, YenSung Chen, ..., Sophie Lebrecht, Yejin Choi, Hannaneh Hajishirzi, Ali Farhadi, Jesse Dodge
09 Apr 2025

A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai, Yilun Zhou, Shi Feng, Abulhair Saparov, Ziyu Yao
02 Jul 2024

Critical Data Size of Language Models from a Grokking Perspective
Xuekai Zhu, Yao Fu, Bowen Zhou, Zhouhan Lin
19 Jan 2024

An Exact Kernel Equivalence for Finite Classification Models
Brian Bell, Michaela Geyer, David Glickenstein, Amanda Fernandez, Juston Moore
01 Aug 2023

Word Embeddings Are Steers for Language Models
Chi Han, Jialiang Xu, Manling Li, Yi R. Fung, Chenkai Sun, Nan Jiang, Tarek Abdelzaher, Heng Ji
LLMSV
22 May 2023

Omnigrok: Grokking Beyond Algorithmic Data
Ziming Liu, Eric J. Michaud, Max Tegmark
03 Oct 2022

Toy Models of Superposition
Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, T. Henighan, ..., Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, C. Olah
AAML, MILM
21 Sep 2022

Every Model Learned by Gradient Descent Is Approximately a Kernel Machine
Pedro M. Domingos
MLT
30 Nov 2020

Attention is not not Explanation
Sarah Wiegreffe, Yuval Pinter
XAI, AAML, FAtt
13 Aug 2019

BERT Rediscovers the Classical NLP Pipeline
Ian Tenney, Dipanjan Das, Ellie Pavlick
MILM, SSeg
15 May 2019

Attention Is All You Need
Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
3DV
12 Jun 2017

Understanding Black-box Predictions via Influence Functions
Pang Wei Koh, Percy Liang
TDI
14 Mar 2017

Pruning Filters for Efficient ConvNets
Hao Li, Asim Kadav, Igor Durdanovic, H. Samet, H. Graf
3DPC
31 Aug 2016

The Mythos of Model Interpretability
Zachary Chase Lipton
FaML
10 Jun 2016

Layer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers
Alexander Binder, G. Montavon, Sebastian Lapuschkin, K. Müller, Wojciech Samek
FAtt
04 Apr 2016

Visualizing and Understanding Convolutional Networks
Matthew D. Zeiler, Rob Fergus
FAtt, SSL
12 Nov 2013