Explaining NLP Models via Minimal Contrastive Editing (MiCE)

27 December 2020

Papers citing "Explaining NLP Models via Minimal Contrastive Editing (MiCE)"

Short-circuiting Shortcuts: Mechanistic Investigation of Shortcuts in Text Classification Leon Eshuijs Shihan Wang Antske Fokkens 26 0 0 09 May 2025
FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation Qianli Wang Nils Feldhus Simon Ostermann Luis Felipe Villa-Arenas Sebastian Möller Vera Schmitt AAML 34 0 0 01 Jan 2025
Counterfactuals As a Means for Evaluating Faithfulness of Attribution Methods in Autoregressive Language Models Sepehr Kamahi Yadollah Yaghoobzadeh 32 0 0 21 Aug 2024
CELL your Model: Contrastive Explanations for Large Language Models Ronny Luss Erik Miehling Amit Dhurandhar 40 0 0 17 Jun 2024
Interpreting Pretrained Language Models via Concept Bottlenecks Zhen Tan Lu Cheng Song Wang Yuan Bo Jundong Li Huan Liu LRM 29 20 0 08 Nov 2023
SMoA: Sparse Mixture of Adapters to Mitigate Multiple Dataset Biases Yanchen Liu Jing Yang Yan Chen Jing Liu Huaqin Wu MoE 34 2 0 28 Feb 2023
Sequentially Controlled Text Generation Alexander Spangher Xinyu Hua Yao Ming Nanyun Peng 24 7 0 05 Jan 2023
CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification Y. Li Canran Xu Guodong Long Tao Shen Chongyang Tao Jing Jiang 38 1 0 11 Nov 2022
A General Search-based Framework for Generating Textual Counterfactual Explanations Daniel Gilo Shaul Markovitch LRM 22 0 0 01 Nov 2022
Finding Dataset Shortcuts with Grammar Induction Dan Friedman Alexander Wettig Danqi Chen 16 14 0 20 Oct 2022
On the Explainability of Natural Language Processing Deep Models Julia El Zini M. Awad 25 82 0 13 Oct 2022
CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation Tanay Dixit Bhargavi Paranjape Hannaneh Hajishirzi Luke Zettlemoyer SyDa 138 23 0 10 Oct 2022
ferret: a Framework for Benchmarking Explainers on Transformers Giuseppe Attanasio Eliana Pastor C. Bonaventura Debora Nozza 28 30 0 02 Aug 2022
Challenges in Applying Explainability Methods to Improve the Fairness of NLP Models Esma Balkir S. Kiritchenko I. Nejadgholi Kathleen C. Fraser 16 36 0 08 Jun 2022
Argumentative Explanations for Pattern-Based Text Classifiers Piyawat Lertvittayakumjorn Francesca Toni 27 4 0 22 May 2022
Necessity and Sufficiency for Explaining Text Classifiers: A Case Study in Hate Speech Detection Esma Balkir I. Nejadgholi Kathleen C. Fraser S. Kiritchenko FAtt 33 27 0 06 May 2022
ExSum: From Local Explanations to Model Understanding Yilun Zhou Marco Tulio Ribeiro J. Shah FAtt LRM 11 25 0 30 Apr 2022
Interpretation of Black Box NLP Models: A Survey Shivani Choudhary N. Chatterjee S. K. Saha FAtt 28 10 0 31 Mar 2022
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey Bonan Min Hayley L Ross Elior Sulem Amir Pouran Ben Veyseh Thien Huu Nguyen Oscar Sainz Eneko Agirre Ilana Heinz Dan Roth LM&MA VLM AI4CE 55 1,029 0 01 Nov 2021
Understanding Interlocking Dynamics of Cooperative Rationalization Mo Yu Yang Zhang Shiyu Chang Tommi Jaakkola 16 41 0 26 Oct 2021
Let the CAT out of the bag: Contrastive Attributed explanations for Text Saneem A. Chemmengath A. Azad Ronny Luss Amit Dhurandhar FAtt 26 10 0 16 Sep 2021
Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation Prakhar Gupta Yulia Tsvetkov Jeffrey P. Bigham 34 22 0 10 Jun 2021
Controlling Text Edition by Changing Answers of Specific Questions Lei Sha Patrick Hohenecker Thomas Lukasiewicz 111 7 0 23 May 2021
Contrastive Explanations for Model Interpretability Alon Jacovi Swabha Swayamdipta Shauli Ravfogel Yanai Elazar Yejin Choi Yoav Goldberg 33 95 0 02 Mar 2021
A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned and Perspectives Nils Rethmeier Isabelle Augenstein SSL VLM 85 90 0 25 Feb 2021
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models Tongshuang Wu Marco Tulio Ribeiro Jeffrey Heer Daniel S. Weld 30 240 0 01 Jan 2021
Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets Chuanrong Li Lin Shengshuo Leo Z. Liu Xinyi Wu Xuhui Zhou Shane Steinert-Threlkeld VLM 128 38 0 16 Oct 2020
e-SNLI: Natural Language Inference with Natural Language Explanations Oana-Maria Camburu Tim Rocktäschel Thomas Lukasiewicz Phil Blunsom LRM 255 620 0 04 Dec 2018
Generating Natural Language Adversarial Examples M. Alzantot Yash Sharma Ahmed Elgohary Bo-Jhang Ho Mani B. Srivastava Kai-Wei Chang AAML 245 914 0 21 Apr 2018
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks Mohit Iyyer John Wieting Kevin Gimpel Luke Zettlemoyer AAML GAN 185 711 0 17 Apr 2018