Causal interventions expose implicit situation models for commonsense
language understanding

Causal interventions expose implicit situation models for commonsense language understanding

6 June 2023

Takateru Yamakoshi

James L. McClelland

Robert D. Hawkins

Papers citing "Causal interventions expose implicit situation models for commonsense language understanding"

12 / 12 papers shown

Title
Perception of Phonological Assimilation by Neural Speech Recognition Models Charlotte Pouw Marianne de Heer Kloots A. Alishahi Willem H. Zuidema 25 2 0 21 Jun 2024
GPT-ology, Computational Models, Silicon Sampling: How should we think about LLMs in Cognitive Science? Desmond C. Ong 28 3 0 13 Jun 2024
ReFT: Representation Finetuning for Language Models Zhengxuan Wu Aryaman Arora Zheng Wang Atticus Geiger Daniel Jurafsky Christopher D. Manning Christopher Potts OffRL 30 58 0 04 Apr 2024
On the Tip of the Tongue: Analyzing Conceptual Representation in Large Language Models with Reverse-Dictionary Probe Ningyu Xu Qi Zhang Menghan Zhang Peng Qian Xuanjing Huang LRM 51 3 0 22 Feb 2024
CausalGym: Benchmarking causal interpretability methods on linguistic tasks Aryaman Arora Daniel Jurafsky Christopher Potts 40 18 0 19 Feb 2024
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations Atticus Geiger Zhengxuan Wu Christopher Potts Thomas F. Icard Noah D. Goodman CML 73 98 0 05 Mar 2023
Quantifying Context Mixing in Transformers Hosein Mohebbi Willem H. Zuidema Grzegorz Chrupała A. Alishahi 164 24 0 30 Jan 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small Kevin Wang Alexandre Variengien Arthur Conmy Buck Shlegeris Jacob Steinhardt 207 486 0 01 Nov 2022
In-context Learning and Induction Heads Catherine Olsson Nelson Elhage Neel Nanda Nicholas Joseph Nova Dassarma ... Tom B. Brown Jack Clark Jared Kaplan Sam McCandlish C. Olah 234 453 0 24 Sep 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Jason W. Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter F. Xia Ed H. Chi Quoc Le Denny Zhou LM&Ro LRM AI4CE ReLM 315 8,261 0 28 Jan 2022
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations Mostafa Abdou Vinit Ravishankar Maria Barrett Yonatan Belinkov Desmond Elliott Anders Søgaard ReLM LRM 29 33 0 04 May 2020
Language Models as Knowledge Bases? Fabio Petroni Tim Rocktaschel Patrick Lewis A. Bakhtin Yuxiang Wu Alexander H. Miller Sebastian Riedel KELM AI4MH 391 2,216 0 03 Sep 2019