Contrastive Explanations for Model Interpretability

2 March 2021

Yejin Choi

Papers citing "Contrastive Explanations for Model Interpretability"

50 / 62 papers shown

Title | Authors | Tags | Counts | Date
Comparative Explanations: Explanation Guided Decision Making for Human-in-the-Loop Preference Selection | Tanmay Chakraborty, Christian Wirth, Christin Seifert | - | 26 0 0 | 01 Apr 2025
Conceptual Contrastive Edits in Textual and Vision-Language Retrieval | Maria Lymperaiou, Giorgos Stamou | VLM | 55 0 0 | 01 Mar 2025
Comparing zero-shot self-explanations with human rationales in text classification | Stephanie Brandl, Oliver Eberle | - | 60 0 0 | 24 Feb 2025
Is Conversational XAI All You Need? Human-AI Decision Making With a Conversational XAI Assistant | Gaole He, Nilay Aishwarya, U. Gadiraju | - | 38 6 0 | 29 Jan 2025
A Comparative Analysis of Counterfactual Explanation Methods for Text Classifiers | Stephen McAleese, Mark Keane | - | 26 0 0 | 04 Nov 2024
Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills | Zana Buçinca, S. Swaroop, Amanda E. Paluch, Finale Doshi-Velez, Krzysztof Z. Gajos | - | 48 2 0 | 05 Oct 2024
CELL your Model: Contrastive Explanations for Large Language Models | Ronny Luss, Erik Miehling, Amit Dhurandhar | - | 40 0 0 | 17 Jun 2024
Unveiling and Manipulating Prompt Influence in Large Language Models | Zijian Feng, Hanzhang Zhou, Zixiao Zhu, Junlang Qian, Kezhi Mao | - | 37 2 0 | 20 May 2024
ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization | Hong Nguyen, H. Nguyen, Melinda Y. Chang, Hieu H. Pham, Shrikanth Narayanan, Michael Pazzani | - | 19 0 0 | 29 Apr 2024
Interactive Prompt Debugging with Sequence Salience | Ian Tenney, Ryan Mullins, Bin Du, Shree Pandya, Minsuk Kahng, Lucas Dixon | LRM | 24 1 0 | 11 Apr 2024
LLM Attributor: Interactive Visual Attribution for LLM Generation | Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy, Alec Helbling, Sheng-Hsuan Peng, Mansi Phute, Duen Horng Chau, Minsuk Kahng | - | 26 3 0 | 01 Apr 2024
Heterogeneous Contrastive Learning for Foundation Models and Beyond | Lecheng Zheng, Baoyu Jing, Zihao Li, Hanghang Tong, Jingrui He | VLM | 26 19 0 | 30 Mar 2024
Visual Analytics for Fine-grained Text Classification Models and Datasets | Munkhtulga Battogtokh, Y. Xing, Cosmin Davidescu, Alfie Abdul-Rahman, Michael Luck, Rita Borgo | - | 31 0 0 | 21 Mar 2024
RORA: Robust Free-Text Rationale Evaluation | Zhengping Jiang, Yining Lu, Hanjie Chen, Daniel Khashabi, Benjamin Van Durme, Anqi Liu | - | 43 1 0 | 28 Feb 2024
Explaining Probabilistic Models with Distributional Values | Luca Franceschi, Michele Donini, Cédric Archambeau, Matthias Seeger | FAtt | 21 2 0 | 15 Feb 2024
Observable Propagation: Uncovering Feature Vectors in Transformers | Jacob Dunefsky, Arman Cohan | - | 33 2 0 | 26 Dec 2023
Navigating the Structured What-If Spaces: Counterfactual Generation via Structured Diffusion | Nishtha Madaan, Srikanta J. Bedathur | DiffM | 25 0 0 | 21 Dec 2023
TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents | James Enouen, Hootan Nakhost, Sayna Ebrahimi, Sercan Ö. Arik, Yan Liu, Tomas Pfister | - | 33 4 0 | 03 Dec 2023
XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making | Zichen Chen, Jianda Chen, Mitali Gaidhani, Ambuj K. Singh, Misha Sra | - | 25 4 0 | 15 Nov 2023
DistillCSE: Distilled Contrastive Learning for Sentence Embeddings | Jiahao Xu, Wei Shao, Lihui Chen, Lemao Liu | FedML | 18 4 0 | 20 Oct 2023
Rather a Nurse than a Physician -- Contrastive Explanations under Investigation | Oliver Eberle, Ilias Chalkidis, Laura Cabello, Stephanie Brandl | - | 22 9 0 | 18 Oct 2023
Disentangling the Linguistic Competence of Privacy-Preserving BERT | Stefan Arnold, Nils Kemmerzell, Annika Schreiner | - | 25 0 0 | 17 Oct 2023
LIPEx-Locally Interpretable Probabilistic Explanations-To Look Beyond The True Class | Hongbo Zhu, Angelo Cangelosi, Procheta Sen, Anirbit Mukherjee | FAtt | 28 0 0 | 07 Oct 2023
Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features | Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, Elena Baralis | - | 11 4 0 | 14 Sep 2023
A Geometric Notion of Causal Probing | Clément Guerner, Anej Svete, Tianyu Liu, Alex Warstadt, Ryan Cotterell | LLMSV | 34 12 0 | 27 Jul 2023
Analyzing Chain-of-Thought Prompting in Large Language Models via Gradient-based Feature Attributions | Skyler Wu, Eric Meng Shen, Charumathi Badrinath, Jiaqi Ma, Himabindu Lakkaraju | LRM | 24 26 0 | 25 Jul 2023
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations | Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen McKeown | LRM | 24 47 0 | 17 Jul 2023
CLIMAX: An exploration of Classifier-Based Contrastive Explanations | Praharsh Nanavati, Ranjitha Prasad | - | 32 0 0 | 02 Jul 2023
Two-Stage Holistic and Contrastive Explanation of Image Classification | Weiyan Xie, Xiao-hui Li, Zhi Lin, Leonard K. M. Poon, Caleb Chen Cao, N. Zhang | - | 19 2 0 | 10 Jun 2023
Empirical Sufficiency Lower Bounds for Language Modeling with Locally-Bootstrapped Semantic Structures | Jakob Prange, Emmanuele Chersoni | - | 29 0 0 | 30 May 2023
Faithfulness Tests for Natural Language Explanations | Pepa Atanasova, Oana-Maria Camburu, Christina Lioma, Thomas Lukasiewicz, J. Simonsen, Isabelle Augenstein | FAtt | 21 59 0 | 29 May 2023
Learning to Generalize for Cross-domain QA | Yingjie Niu, Linyi Yang, Ruihai Dong, Yue Zhang | - | 11 6 0 | 14 May 2023
Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering | Qianglong Chen, Guohai Xu, Mingshi Yan, Ji Zhang, Fei Huang, Luo Si, Yin Zhang | - | 13 9 0 | 14 May 2023
Surfacing Biases in Large Language Models using Contrastive Input Decoding | G. Yona, Or Honovich, Itay Laish, Roee Aharoni | - | 27 11 0 | 12 May 2023
Explaining Model Confidence Using Counterfactuals | Thao Le, Tim Miller, Ronal Singh, L. Sonenberg | - | 14 2 0 | 10 Mar 2023
Signed Directed Graph Contrastive Learning with Laplacian Augmentation | Taewook Ko, Y. Choi, Chong-Kwon Kim | - | 26 3 0 | 12 Jan 2023
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning | O. Yu. Golovneva, Moya Chen, Spencer Poff, Martin Corredor, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz | ReLM, LRM | 20 137 0 | 15 Dec 2022
CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification | Y. Li, Canran Xu, Guodong Long, Tao Shen, Chongyang Tao, Jing Jiang | - | 38 1 0 | 11 Nov 2022
A General Search-based Framework for Generating Textual Counterfactual Explanations | Daniel Gilo, Shaul Markovitch | LRM | 16 0 0 | 01 Nov 2022
Does Self-Rationalization Improve Robustness to Spurious Correlations? | Alexis Ross, Matthew E. Peters, Ana Marasović | LRM | 19 11 0 | 24 Oct 2022
Lexical Generalization Improves with Larger Models and Longer Training | Elron Bandel, Yoav Goldberg, Yanai Elazar | - | 47 6 0 | 23 Oct 2022
Log-linear Guardedness and its Implications | Shauli Ravfogel, Yoav Goldberg, Ryan Cotterell | - | 25 2 0 | 18 Oct 2022
Beyond Model Interpretability: On the Faithfulness and Adversarial Robustness of Contrastive Textual Explanations | Julia El Zini, M. Awad | AAML | 16 2 0 | 17 Oct 2022
Contrastive Corpus Attribution for Explaining Representations | Christy Lin, Hugh Chen, Chanwoo Kim, Su-In Lee | SSL | 17 8 0 | 30 Sep 2022
Towards Faithful Model Explanation in NLP: A Survey | Qing Lyu, Marianna Apidianaki, Chris Callison-Burch | XAI | 104 107 0 | 22 Sep 2022
Policy Optimization with Sparse Global Contrastive Explanations | Jiayu Yao, S. Parbhoo, Weiwei Pan, Finale Doshi-Velez | OffRL | 9 1 0 | 13 Jul 2022
Probing Classifiers are Unreliable for Concept Removal and Detection | Abhinav Kumar, Chenhao Tan, Amit Sharma | AAML | 15 20 0 | 08 Jul 2022
Improving Model Understanding and Trust with Counterfactual Explanations of Model Confidence | Thao Le, Tim Miller, Ronal Singh, L. Sonenberg | - | 12 9 0 | 06 Jun 2022
Investigating the Benefits of Free-Form Rationales | Jiao Sun, Swabha Swayamdipta, Jonathan May, Xuezhe Ma | - | 11 14 0 | 25 May 2022
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations | Jaehun Jung, Lianhui Qin, Sean Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras, Yejin Choi | ReLM, LRM | 218 189 0 | 24 May 2022