CausalGym: Benchmarking causal interpretability methods on linguistic tasks

19 February 2024
Aryaman Arora
Daniel Jurafsky
Christopher Potts

Papers citing "CausalGym: Benchmarking causal interpretability methods on linguistic tasks"

8 / 8 papers shown
Large Language Models Share Representations of Latent Grammatical Concepts Across Typologically Diverse Languages
Jannik Brinkmann
Chris Wendler
Christian Bartelt
Aaron Mueller
35
9
0
10 Jan 2025
Language models align with human judgments on key grammatical constructions
Jennifer Hu
Kyle Mahowald
G. Lupyan
Anna A. Ivanova
Roger Levy
22
10
0
19 Jan 2024
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks
Max Tegmark
HILM
83
164
0
10 Oct 2023
A Geometric Notion of Causal Probing
Clément Guerner
Anej Svete
Tianyu Liu
Alex Warstadt
Ryan Cotterell
LLMSV
24
12
0
27 Jul 2023
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations
Atticus Geiger
Zhengxuan Wu
Christopher Potts
Thomas F. Icard
Noah D. Goodman
CML
73
98
0
05 Mar 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
205
486
0
01 Nov 2022
Naturalistic Causal Probing for Morpho-Syntax
Afra Amini
Tiago Pimentel
Clara Meister
Ryan Cotterell
MILM
93
13
0
14 May 2022
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
216
291
0
24 Feb 2021