arXiv:1808.08079 (v3, latest)
Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information

24 August 2018
Mario Giulianelli, J. Harding, Florian Mohnert, Dieuwke Hupkes, Willem H. Zuidema

Papers citing "Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information"

Showing 50 of 121 citing papers.
Does higher interpretability imply better utility? A Pairwise Analysis on Sparse Autoencoders
Xu Wang, Yan Hu, Benyou Wang, Difan Zou
04 Oct 2025

Line of Sight: On Linear Representations in VLLMs
Achyuta Rajaram, Sarah Schwettmann, Jacob Andreas, Arthur Conmy
05 Jun 2025

Identifying and Mitigating the Influence of the Prior Distribution in Large Language Models
Liyi Zhang, Veniamin Veselovsky, R. Thomas McCoy, Thomas Griffiths
17 Apr 2025

Mechanistic Interpretability of Emotion Inference in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ala Nekouvaght Tak, Amin Banayeeanzade, Anahita Bolourani, Mina Kian, Robin Jia, Jonathan Gratch
08 Feb 2025

StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training
Kaustubh Ponkshe, Venkatapathy Subramanian, Natwar Modani, Ganesh Ramakrishnan
25 Nov 2024

Gumbel Counterfactual Generation From Language Models
International Conference on Learning Representations (ICLR), 2024
Shauli Ravfogel, Anej Svete, Vésteinn Snæbjarnarson, Robert Bamler
11 Nov 2024

Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zheng Zhao, Yftah Ziser, Shay B. Cohen
25 Oct 2024

Can Language Models Induce Grammatical Knowledge from Indirect Evidence?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Miyu Oba, Yohei Oseki, Akiyo Fukatsu, Akari Haga, Hiroki Ouchi, Taro Watanabe, Saku Sugawara
08 Oct 2024

Mechanistic?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2024
Naomi Saphra, Sarah Wiegreffe
07 Oct 2024

How Language Models Prioritize Contextual Grammatical Cues?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2024
Hamidreza Amirzadeh, Afra Alishahi, Hosein Mohebbi
04 Oct 2024

Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models
International Conference on Computational Linguistics (COLING), 2024
Xinyu Zhou, Delong Chen, Samuel Cahyawijaya, Xufeng Duan, Zhenguang G. Cai
19 Sep 2024

Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2024
Róbert Csordás, Christopher Potts, Christopher D. Manning, Atticus Geiger
20 Aug 2024

The Quest for the Right Mediator: Surveying Mechanistic Interpretability Through the Lens of Causal Mediation Analysis
Computational Linguistics (CL), 2024
Aaron Mueller, Jannik Brinkmann, Millicent Li, Samuel Marks, Koyena Pal, ..., Arnab Sen Sharma, Jiuding Sun, Eric Todd, David Bau, Yonatan Belinkov
02 Aug 2024

On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon, Roi Reichart
27 Jul 2024

Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks
Aaron Mueller
05 Jul 2024

Does ChatGPT Have a Mind?
Simon Goldstein, B. Levinstein
27 Jun 2024

What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation
Michal Golovanevsky, William Rudman, Vedant Palit, Ritambhara Singh, Carsten Eickhoff
24 Jun 2024

Perception of Phonological Assimilation by Neural Speech Recognition Models
Charlotte Pouw, Marianne de Heer Kloots, Afra Alishahi, Willem H. Zuidema
21 Jun 2024

What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions
Liyi Zhang, Michael Y. Li, Thomas Griffiths, Theodore R. Sumers, Jian-Qiao Zhu, Thomas L. Griffiths
06 Jun 2024

Probing the Category of Verbal Aspect in Transformer Language Models
Anisia Katinskaia, R. Yangarber
04 Jun 2024

A Philosophical Introduction to Language Models - Part II: The Way Forward
Raphael Milliere, Cameron Buckner
06 May 2024

From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency
Xenia Ohmer, Elia Bruni, Dieuwke Hupkes
18 Apr 2024

More than Correlation: Do Large Language Models Learn Causal Representations of Space?
Yida Chen, Yixian Gan, Sijia Li, Li Yao, Xiaohan Zhao
26 Dec 2023

Deep de Finetti: Recovering Topic Distributions from Large Language Models
Liyi Zhang, R. Thomas McCoy, T. Sumers, Jian-Qiao Zhu, Thomas Griffiths
21 Dec 2023

Grammatical information in BERT sentence embeddings as two-dimensional arrays
Workshop on Representation Learning for NLP (RepL4NLP), 2023
Vivi Nastase, Paola Merlo
15 Dec 2023

Codebook Features: Sparse and Discrete Interpretability for Neural Networks
International Conference on Machine Learning (ICML), 2023
Alex Tamkin, Mohammad Taufeeque, Noah D. Goodman
26 Oct 2023

Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Max Müller-Eberstein, Rob van der Goot, Barbara Plank, Ivan Titov
25 Oct 2023

Unnatural language processing: How do language models handle machine-generated prompts?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Corentin Kervadec, Francesca Franzon, Marco Baroni
24 Oct 2023

Verb Conjugation in Transformers Is Determined by Linear Encodings of Subject Number
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Sophie Hao, Tal Linzen
23 Oct 2023

Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Dmitry Nikolaev, Sebastian Padó
18 Oct 2023

Emergent Linear Representations in World Models of Self-Supervised Sequence Models
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Neel Nanda, Andrew Lee, Martin Wattenberg
02 Sep 2023

Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP
Vedant Palit, Rohan Pandey, Aryaman Arora, Paul Pu Liang
27 Aug 2023

Operationalising Representation in Natural Language Processing
British Journal for the Philosophy of Science (BJPS), 2023
J. Harding
14 Jun 2023

How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Neural Information Processing Systems (NeurIPS), 2023
Michael Hanna, Ollie Liu, Alexandre Variengien
30 Apr 2023

Interventional Probing in High Dimensions: An NLI Case Study
Findings, 2023
Julia Rozanova, Marco Valentino, Lucas C. Cordeiro, André Freitas
20 Apr 2023

Exposing the Functionalities of Neurons for Gated Recurrent Unit Based Sequence-to-Sequence Model
Yi-Ting Lee, Da-Yi Wu, Chih-Chun Yang, Shou-De Lin
27 Mar 2023

An Overview on Language Models: Recent Developments and Outlook
APSIPA Transactions on Signal and Information Processing (TASIP), 2023
Chengwei Wei, Yun Cheng Wang, Bin Wang, C.-C. Jay Kuo
10 Mar 2023

Dissociating language and thought in large language models
Kyle Mahowald, Anna A. Ivanova, I. Blank, Nancy Kanwisher, J. Tenenbaum, Evelina Fedorenko
16 Jan 2023

Reconstruction Probing
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Najoung Kim, Jatin Khilnani, Alex Warstadt, Abed Qaddoumi
21 Dec 2022

Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement
Transactions of the Association for Computational Linguistics (TACL), 2022
Bingzhi Li, Guillaume Wisniewski, Benoît Crabbé
08 Dec 2022

Do LSTMs See Gender? Probing the Ability of LSTMs to Learn Abstract Syntactic Rules
Priyanka Sukumaran, Conor J. Houghton, N. Kazanina
31 Oct 2022

Understanding Domain Learning in Language Models Through Subpopulation Analysis
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022
Zheng Zhao, Yftah Ziser, Shay B. Cohen
22 Oct 2022

Probing with Noise: Unpicking the Warp and Weft of Embeddings
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022
Filip Klubicka, John D. Kelleher
21 Oct 2022

Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset
Guanyi Chen, F. Same, Kees van Deemter
10 Oct 2022

State-of-the-art generalisation research in NLP: A taxonomy and review
Nature Machine Intelligence (Nat. Mach. Intell.), 2022
Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, ..., Leila Khalatbari, Maria Ryskina, Rita Frieske, Robert Bamler, Zhijing Jin
06 Oct 2022

Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions
Yanai Elazar, Nora Kassner, Haiqin Yang, Amir Feder, Abhilasha Ravichander, Marius Mosbach, Yonatan Belinkov, Hinrich Schütze, Yoav Goldberg
28 Jul 2022

Probing via Prompting
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Jiaoda Li, Robert Bamler, Mrinmaya Sachan
04 Jul 2022

Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences
Mark Anderson, Jose Camacho-Collados
16 May 2022

Naturalistic Causal Probing for Morpho-Syntax
Transactions of the Association for Computational Linguistics (TACL), 2022
Afra Amini, Tiago Pimentel, Clara Meister, Robert Bamler
14 May 2022

When Does Syntax Mediate Neural Language Model Performance? Evidence from Dropout Probes
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Mycal Tucker, Tiwalayo Eisape, Peng Qian, R. Levy, J. Shah
20 Apr 2022