Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models

10 June 2021

Papers citing "Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models"

Title
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models Tyler A. Chang Benjamin Bergen 50 0 0 21 Apr 2025
MIB: A Mechanistic Interpretability Benchmark Aaron Mueller Atticus Geiger Sarah Wiegreffe Dana Arad Iván Arcuschin ... Alessandro Stolfo Martin Tutek Amir Zur David Bau Yonatan Belinkov 43 1 0 17 Apr 2025
Are formal and functional linguistic mechanisms dissociated in language models? Michael Hanna Sandro Pezzelle Yonatan Belinkov 47 0 0 14 Mar 2025
HyperDAS: Towards Automating Mechanistic Interpretability with Hypernetworks Jiuding Sun Jing Huang Sidharth Baskaran Karel D'Oosterlinck Christopher Potts Michael Sklar Atticus Geiger AI4CE 68 0 0 13 Mar 2025
Linguistically Grounded Analysis of Language Models using Shapley Head Values Marcell Richard Fekete Johannes Bjerva 31 0 0 17 Oct 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs Nitay Calderon Roi Reichart 38 10 0 27 Jul 2024
What does the Knowledge Neuron Thesis Have to do with Knowledge? Jingcheng Niu Andrew Liu Zining Zhu Gerald Penn 48 30 0 03 May 2024
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL Yutong Shao N. Nakashole 22 1 0 03 Apr 2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models Samuel Marks Can Rager Eric J. Michaud Yonatan Belinkov David Bau Aaron Mueller 44 111 0 28 Mar 2024
Identifying Linear Relational Concepts in Large Language Models David Chanin Anthony Hunter Oana-Maria Camburu LLMSV KELM 18 4 0 15 Nov 2023
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods Fred Zhang Neel Nanda LLMSV 28 97 0 27 Sep 2023
Generalizing Backpropagation for Gradient-Based Interpretability Kevin Du Lucas Torroba Hennigen Niklas Stoehr Alex Warstadt Ryan Cotterell MILM FAtt 24 7 0 06 Jul 2023
Causal interventions expose implicit situation models for commonsense language understanding Takateru Yamakoshi James L. McClelland A. Goldberg Robert D. Hawkins 17 6 0 06 Jun 2023
Localizing Model Behavior with Path Patching Nicholas W. Goldowsky-Dill Chris MacLeod L. Sato Aryaman Arora 21 85 0 12 Apr 2023
Dissociating language and thought in large language models Kyle Mahowald Anna A. Ivanova I. Blank Nancy Kanwisher J. Tenenbaum Evelina Fedorenko ELM ReLM 25 209 0 16 Jan 2023
Deep Causal Learning: Representation, Discovery and Inference Zizhen Deng Xiaolong Zheng Hu Tian D. Zeng CML BDL 31 11 0 07 Nov 2022
Testing Pre-trained Language Models' Understanding of Distributivity via Causal Mediation Analysis Pangbo Ban Yifan Jiang Tianran Liu Shane Steinert-Threlkeld 48 4 0 11 Sep 2022
Interpretation of Black Box NLP Models: A Survey Shivani Choudhary N. Chatterjee S. K. Saha FAtt 32 10 0 31 Mar 2022
Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models Aaron Mueller Robert Frank Tal Linzen Luheng Wang Sebastian Schuster AIMat 19 33 0 17 Mar 2022
Interpreting Language Models with Contrastive Explanations Kayo Yin Graham Neubig MILM 21 77 0 21 Feb 2022
Locating and Editing Factual Associations in GPT Kevin Meng David Bau A. Andonian Yonatan Belinkov KELM 56 1,184 0 10 Feb 2022
Sparse Interventions in Language Models with Differentiable Masking Nicola De Cao Leon Schmid Dieuwke Hupkes Ivan Titov 32 27 0 13 Dec 2021
Text as Causal Mediators: Research Design for Causal Estimates of Differential Treatment of Social Groups via Language Aspects Katherine A. Keith Douglas Rice Brendan T. O'Connor CML 19 3 0 15 Sep 2021
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond Amir Feder Katherine A. Keith Emaad A. Manzoor Reid Pryzant Dhanya Sridhar ... Roi Reichart Margaret E. Roberts Brandon M Stewart Victor Veitch Diyi Yang CML 35 234 0 02 Sep 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 297 6,956 0 20 Apr 2018