Designing and Interpreting Probes with Control Tasks

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

8 September 2019

Papers citing "Designing and Interpreting Probes with Control Tasks"


All Roads Lead to Rome? Exploring the Invariance of Transformers' Representations Yuxin Ren Qipeng Guo Zhijing Jin Shauli Ravfogel Mrinmaya Sachan Bernhard Schölkopf Robert Bamler 132 5 0 23 May 2023
Can LLMs facilitate interpretation of pre-trained language models? Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Basel Mousi Nadir Durrani Fahim Dalvi 296 15 0 22 May 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models. International Conference on Language Resources and Evaluation (LREC), 2023 Oana Ignat Zhijing Jin Artem Abzaliev Laura Biester Santiago Castro ... Verónica Pérez-Rosas Siqi Shen Zekun Wang Winston Wu Amélie Reymond LRM 316 8 0 21 May 2023
Finding Neurons in a Haystack: Case Studies with Sparse Probing Wes Gurnee Neel Nanda Matthew Pauly Katherine Harvey Dmitrii Troitskii Dimitris Bertsimas MILM 511 283 0 02 May 2023
Redundancy and Concept Analysis for Code-trained Language Models Arushi Sharma Zefu Hu Christopher Quinn Ali Jannesari 248 3 0 01 May 2023
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression. Neural Information Processing Systems (NeurIPS), 2023 Shuai Li Zhao Song Yu Xia Tong Yu Wanrong Zhu 172 49 0 26 Apr 2023
Interventional Probing in High Dimensions: An NLI Case Study. Findings (Findings), 2023 Julia Rozanova Marco Valentino Lucas C. Cordeiro André Freitas 87 8 0 20 Apr 2023
Inspecting and Editing Knowledge Representations in Language Models Evan Hernandez Belinda Z. Li Jacob Andreas KELM 293 121 0 03 Apr 2023
Do Transformers Parse while Predicting the Masked Word? Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Haoyu Zhao A. Panigrahi Rong Ge Sanjeev Arora 318 39 0 14 Mar 2023
A Theory of Emergent In-Context Learning as Implicit Structure Induction Michael Hahn Navin Goyal LRM 230 99 0 14 Mar 2023
Eliciting Latent Predictions from Transformers with the Tuned Lens Nora Belrose Zach Furman Logan Smith Danny Halawi Igor V. Ostrovsky Lev McKinney Stella Biderman Jacob Steinhardt 564 310 0 14 Mar 2023
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding. International Conference on Machine Learning (ICML), 2023 Yuchen Li Yuan-Fang Li Andrej Risteski 357 79 0 07 Mar 2023
DeepLens: Interactive Out-of-distribution Data Detection in NLP Models. International Conference on Human Factors in Computing Systems (CHI), 2023 D. Song Zhijie Wang Yuheng Huang Lei Ma Tianyi Zhang 139 8 0 02 Mar 2023
Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study. International Conference on Learning Representations (ICLR), 2023 Mingxu Tao Yansong Feng Dongyan Zhao CLL KELM 164 11 0 02 Mar 2023
Competence-Based Analysis of Language Models Adam Davies Jize Jiang Chengxiang Zhai ELM 357 7 0 01 Mar 2023
Does Deep Learning Learn to Abstract? A Systematic Probing Framework. International Conference on Learning Representations (ICLR), 2023 Shengnan An Zeqi Lin B. Chen Qiang Fu Nanning Zheng Jian-Guang Lou 214 6 0 23 Feb 2023
Evaluating Representations with Readout Model Switching. International Conference on Learning Representations (ICLR), 2023 Yazhe Li J. Bornschein Marcus Hutter 156 1 0 19 Feb 2023
Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness. European Conference on Computer Vision (ECCV), 2023 Ailin Deng Shen Li Miao Xiong Zhirui Chen Bryan Hooi 148 4 0 06 Feb 2023
Evaluating Neuron Interpretation Methods of NLP Models. Neural Information Processing Systems (NeurIPS), 2023 Yimin Fan Fahim Dalvi Nadir Durrani Hassan Sajjad 257 9 0 30 Jan 2023
Dissociating language and thought in large language models Kyle Mahowald Anna A. Ivanova I. Blank Nancy Kanwisher J. Tenenbaum Evelina Fedorenko ELM ReLM 284 229 0 16 Jan 2023
Rationalizing Predictions by Adversarial Information Calibration. Artificial Intelligence (AI), 2022 Lei Sha Oana-Maria Camburu Thomas Lukasiewicz 173 9 0 15 Jan 2023
Removing Non-Stationary Knowledge From Pre-Trained Language Models for Entity-Level Sentiment Classification in Finance Seunghyeok Hong Hanwool Albert Lee Nahyeon Kang Moonjeong Hahm 197 10 0 09 Jan 2023
Can Large Language Models Change User Preference Adversarially? Varshini Subhash AAML 158 9 0 05 Jan 2023
Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer? Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Ningyu Xu Tao Gui Ruotian Ma Tao Gui Jingting Ye Menghan Zhang Xuanjing Huang 217 14 0 21 Dec 2022
Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering. International Conference on Computational Logic (ICCL), 2022 Akshay Chaturvedi Swarnadeep Bhar Soumadeep Saha Utpal Garain Nicholas Asher 217 8 0 21 Dec 2022
Trustworthy Social Bias Measurement. AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2022 Rishi Bommasani Abigail Z. Jacobs 235 13 0 20 Dec 2022
CREPE: Can Vision-Language Foundation Models Reason Compositionally? Computer Vision and Pattern Recognition (CVPR), 2022 Zixian Ma Jerry Hong Mustafa Omer Gul Mona Gandhi Irena Gao Ranjay Krishna CoGe 363 179 0 13 Dec 2022
Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement. Transactions of the Association for Computational Linguistics (TACL), 2022 Bingzhi Li Guillaume Wisniewski Benoît Crabbé 265 16 0 08 Dec 2022
Intermediate Entity-based Sparse Interpretable Representation Learning. BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022 Diego Garcia-Olano Yasumasa Onoe Joydeep Ghosh Byron C. Wallace 214 2 0 03 Dec 2022
Localization vs. Semantics: Visual Representations in Unimodal and Multimodal Models. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022 Zhuowan Li Cihang Xie Benjamin Van Durme Yaoyao Liu VLM SSL 139 2 0 01 Dec 2022
Penalizing Confident Predictions on Largely Perturbed Inputs Does Not Improve Out-of-Distribution Generalization in Question Answering Kazutoshi Shinoda Saku Sugawara Akiko Aizawa OOD AAML 121 1 0 29 Nov 2022
Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex. Neural Information Processing Systems (NeurIPS), 2022 Charles Lovering Jessica Zosa Forde George Konidaris Ellie Pavlick Michael L. Littman 111 12 0 26 Nov 2022
Probing for Incremental Parse States in Autoregressive Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Tiwalayo Eisape Vineet Gangireddy R. Levy Yoon Kim 229 16 0 17 Nov 2022
The Architectural Bottleneck Principle. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Tiago Pimentel Josef Valvoda Niklas Stoehr Robert Bamler 147 5 0 11 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model BigScience Workshop: Teven Le Scao Angela Fan Christopher Akiki ... Zhongli Xie Zifan Ye M. Bras Younes Belkada Thomas Wolf VLM 828 2,735 0 09 Nov 2022
SocioProbe: What, When, and Where Language Models Learn about Sociodemographics. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Anne Lauscher Federico Bianchi Samuel R. Bowman Dirk Hovy 201 10 0 08 Nov 2022
Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models. Conference on Computational Natural Language Learning (CoNLL), 2022 Aaron Mueller Yudi Xia Tal Linzen MILM 221 13 0 25 Oct 2022
Universal and Independent: Multilingual Probing Framework for Exhaustive Model Interpretation and Evaluation. BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022 O. Serikov Vitaly Protasov E. Voloshina V. Knyazkova Tatiana Shavrina 160 4 0 24 Oct 2022
On the Transformation of Latent Space in Fine-Tuned NLP Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Nadir Durrani Hassan Sajjad Fahim Dalvi Firoj Alam 249 20 0 23 Oct 2022
Enhancing Tabular Reasoning with Pattern Exploiting Training Abhilash Shankarampeta Vivek Gupta Shuo Zhang LMTD RALM ReLM 274 6 0 21 Oct 2022
Probing with Noise: Unpicking the Warp and Weft of Embeddings. BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022 Filip Klubicka John D. Kelleher 181 4 0 21 Oct 2022
Spectral Probing. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Max Müller-Eberstein Rob van der Goot Barbara Plank 101 2 0 21 Oct 2022
SLING: Sino Linguistic Evaluation of Large Language Models. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Yixiao Song Kalpesh Krishna R. Bhatt Mohit Iyyer 203 14 0 21 Oct 2022
Choose Your Lenses: Flaws in Gender Bias Evaluation Hadas Orgad Yonatan Belinkov 226 39 0 20 Oct 2022
Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Shuo Xie Jiahao Qiu Ankita Pasad Li Du Qing Qu Hongyuan Mei 212 16 0 18 Oct 2022
Post-hoc analysis of Arabic transformer models. BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2022 Ahmed Abdelali Nadir Durrani Fahim Dalvi Hassan Sajjad 112 1 0 18 Oct 2022
Transparency Helps Reveal When Language Models Learn Meaning. Transactions of the Association for Computational Linguistics (TACL), 2022 Zhaofeng Wu William Merrill Hao Peng Iz Beltagy Noah A. Smith 311 11 0 14 Oct 2022
Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset Guanyi Chen F. Same Kees van Deemter 119 0 0 10 Oct 2022
COMPS: Conceptual Minimal Pair Sentences for testing Robust Property Knowledge and its Inheritance in Pre-trained Language Models. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022 Kanishka Misra Julia Taylor Rayz Allyson Ettinger 453 17 0 05 Oct 2022
Towards Faithful Model Explanation in NLP: A Survey. Computational Linguistics (CL), 2022 Qing Lyu Marianna Apidianaki Chris Callison-Burch XAI 478 166 0 22 Sep 2022
