2106.06087
Cited By
Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
10 June 2021
Matthew Finlayson
Aaron Mueller
Sebastian Gehrmann
Stuart M. Shieber
Tal Linzen
Yonatan Belinkov
Papers citing "Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models"
25 / 25 papers shown
Title
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
Tyler A. Chang
Benjamin Bergen
50
0
0
21 Apr 2025
MIB: A Mechanistic Interpretability Benchmark
Aaron Mueller
Atticus Geiger
Sarah Wiegreffe
Dana Arad
Iván Arcuschin
...
Alessandro Stolfo
Martin Tutek
Amir Zur
David Bau
Yonatan Belinkov
43
1
0
17 Apr 2025
Are formal and functional linguistic mechanisms dissociated in language models?
Michael Hanna
Sandro Pezzelle
Yonatan Belinkov
47
0
0
14 Mar 2025
HyperDAS: Towards Automating Mechanistic Interpretability with Hypernetworks
Jiuding Sun
Jing Huang
Sidharth Baskaran
Karel D'Oosterlinck
Christopher Potts
Michael Sklar
Atticus Geiger
AI4CE
68
0
0
13 Mar 2025
Linguistically Grounded Analysis of Language Models using Shapley Head Values
Marcell Richard Fekete
Johannes Bjerva
31
0
0
17 Oct 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
38
10
0
27 Jul 2024
What does the Knowledge Neuron Thesis Have to do with Knowledge?
Jingcheng Niu
Andrew Liu
Zining Zhu
Gerald Penn
48
30
0
03 May 2024
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL
Yutong Shao
N. Nakashole
22
1
0
03 Apr 2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Samuel Marks
Can Rager
Eric J. Michaud
Yonatan Belinkov
David Bau
Aaron Mueller
44
111
0
28 Mar 2024
Identifying Linear Relational Concepts in Large Language Models
David Chanin
Anthony Hunter
Oana-Maria Camburu
LLMSV
KELM
18
4
0
15 Nov 2023
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods
Fred Zhang
Neel Nanda
LLMSV
28
97
0
27 Sep 2023
Generalizing Backpropagation for Gradient-Based Interpretability
Kevin Du
Lucas Torroba Hennigen
Niklas Stoehr
Alex Warstadt
Ryan Cotterell
MILM
FAtt
24
7
0
06 Jul 2023
Causal interventions expose implicit situation models for commonsense language understanding
Takateru Yamakoshi
James L. McClelland
A. Goldberg
Robert D. Hawkins
17
6
0
06 Jun 2023
Localizing Model Behavior with Path Patching
Nicholas W. Goldowsky-Dill
Chris MacLeod
L. Sato
Aryaman Arora
21
85
0
12 Apr 2023
Dissociating language and thought in large language models
Kyle Mahowald
Anna A. Ivanova
I. Blank
Nancy Kanwisher
J. Tenenbaum
Evelina Fedorenko
ELM
ReLM
25
209
0
16 Jan 2023
Deep Causal Learning: Representation, Discovery and Inference
Zizhen Deng
Xiaolong Zheng
Hu Tian
D. Zeng
CML
BDL
31
11
0
07 Nov 2022
Testing Pre-trained Language Models' Understanding of Distributivity via Causal Mediation Analysis
Pangbo Ban
Yifan Jiang
Tianran Liu
Shane Steinert-Threlkeld
48
4
0
11 Sep 2022
Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary
N. Chatterjee
S. K. Saha
FAtt
32
10
0
31 Mar 2022
Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models
Aaron Mueller
Robert Frank
Tal Linzen
Luheng Wang
Sebastian Schuster
AIMat
19
33
0
17 Mar 2022
Interpreting Language Models with Contrastive Explanations
Kayo Yin
Graham Neubig
MILM
21
77
0
21 Feb 2022
Locating and Editing Factual Associations in GPT
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
56
1,184
0
10 Feb 2022
Sparse Interventions in Language Models with Differentiable Masking
Nicola De Cao
Leon Schmid
Dieuwke Hupkes
Ivan Titov
32
27
0
13 Dec 2021
Text as Causal Mediators: Research Design for Causal Estimates of Differential Treatment of Social Groups via Language Aspects
Katherine A. Keith
Douglas Rice
Brendan T. O'Connor
CML
19
3
0
15 Sep 2021
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond
Amir Feder
Katherine A. Keith
Emaad A. Manzoor
Reid Pryzant
Dhanya Sridhar
...
Roi Reichart
Margaret E. Roberts
Brandon M Stewart
Victor Veitch
Diyi Yang
CML
35
234
0
02 Sep 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018