arXiv: 2408.08137
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
15 August 2024
Joakim Edin, Andreas Geert Motzfeldt, Casper L. Christensen, Tuukka Ruotsalo, Lars Maaløe, Maria Maistro
Papers citing "Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability" (35 papers shown)
GIM: Improved Interpretability for Large Language Models
Joakim Edin, Róbert Csordás, Tuukka Ruotsalo, Zhengxuan Wu, Maria Maistro, Jing-ling Huang, Lars Maaløe
23 May 2025

Fool Me Once? Contrasting Textual and Visual Explanations in a Clinical Decision-Support Setting
Maxime Kayser, Bayar I. Menzat, Cornelius Emde, Bogdan Bercean, Alex Novak, Abdala Espinosa, B. Papież, Susanne Gaube, Thomas Lukasiewicz, Oana-Maria Camburu
16 Oct 2024

An Unsupervised Approach to Achieve Supervised-Level Explainability in Healthcare Records
Joakim Edin, Maria Maistro, Lars Maaløe, Lasse Borgholt, Jakob Drachmann Havtorn, Tuukka Ruotsalo
13 Jun 2024

Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales
Lucas Resck, Marcos M. Raimundo, Jorge Poco
03 Apr 2024

Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training
Dongfang Li, Baotian Hu, Qingcai Chen, Shan He
29 Dec 2023

DecompX: Explaining Transformers Decisions by Propagating Token Decomposition
Ali Modarressi, Mohsen Fayyaz, Ehsan Aghazadeh, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
05 Jun 2023

Incorporating Attribution Importance for Improving Faithfulness Metrics
Zhixue Zhao, Nikolaos Aletras
17 May 2023

EvalAttAI: A Holistic Approach to Evaluating Attribution Maps in Robust and Non-Robust Models
Ian E. Nielsen, Ravichandran Ramachandran, N. Bouaynaya, Hassan M. Fathallah-Shaykh, Ghulam Rasool
15 Mar 2023

Improving Interpretability via Explicit Word Interaction Graph Layer
Arshdeep Sekhon, Hanjie Chen, A. Shrivastava, Zhe Wang, Yangfeng Ji, Yanjun Qi
03 Feb 2023

Towards Faithful Model Explanation in NLP: A Survey
Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
22 Sep 2022

The Solvability of Interpretability Evaluation Metrics
Yilun Zhou, J. Shah
18 May 2022

An Empirical Study on Explanations in Out-of-Domain Settings
G. Chrysostomou, Nikolaos Aletras
28 Feb 2022

Logic Traps in Evaluating Attribution Scores
Yiming Ju, Yuanzhe Zhang, Zhao Yang, Zhongtao Jiang, Kang Liu, Jun Zhao
12 Sep 2021

Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience
G. Chrysostomou, Nikolaos Aletras
31 Aug 2021

The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations
Peter Hase, Harry Xie, Joey Tianyi Zhou
01 Jun 2021

On the Sensitivity and Stability of Model Interpretations in NLP
Fan Yin, Zhouxing Shi, Cho-Jui Hsieh, Kai-Wei Chang
18 Apr 2021

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
Jasmijn Bastings, Katja Filippova
12 Oct 2020

Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers
Hanjie Chen, Yangfeng Ji
01 Oct 2020

How does this interaction affect me? Interpretable attribution for feature interactions
Michael Tsang, Sirisha Rambhatla, Yan Liu
19 Jun 2020

Evaluating and Aggregating Feature-based Model Explanations
Umang Bhatt, Adrian Weller, J. M. F. Moura
01 May 2020

Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
Alon Jacovi, Yoav Goldberg
07 Apr 2020

ERASER: A Benchmark to Evaluate Rationalized NLP Models
Jay DeYoung, Sarthak Jain, Nazneen Rajani, Eric P. Lehman, Caiming Xiong, R. Socher, Byron C. Wallace
08 Nov 2019

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
02 Oct 2019

One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques
Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, ..., Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis L. Wei, Yunfeng Zhang
06 Sep 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
26 Jul 2019

Attention is not Explanation
Sarthak Jain, Byron C. Wallace
26 Feb 2019

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
11 Oct 2018

A Benchmark for Interpretability Methods in Deep Neural Networks
Sara Hooker, D. Erhan, Pieter-Jan Kindermans, Been Kim
28 Jun 2018

Pathologies of Neural Models Make Interpretations Difficult
Shi Feng, Eric Wallace, Alvin Grissom II, Mohit Iyyer, Pedro Rodriguez, Jordan L. Boyd-Graber
20 Apr 2018

A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg, Su-In Lee
22 May 2017

Learning Important Features Through Propagating Activation Differences
Avanti Shrikumar, Peyton Greenside, A. Kundaje
10 Apr 2017

Axiomatic Attribution for Deep Networks
Mukund Sundararajan, Ankur Taly, Qiqi Yan
04 Mar 2017

Beam Search Strategies for Neural Machine Translation
Markus Freitag, Yaser Al-Onaizan
06 Feb 2017

Evaluating the visualization of what a Deep Neural Network has learned
Wojciech Samek, Alexander Binder, G. Montavon, Sebastian Lapuschkin, K. Müller
21 Sep 2015
Character-level Convolutional Networks for Text Classification
Xiang Zhang, Junbo Zhao, Yann LeCun
04 Sep 2015