arXiv: 2408.08137
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
15 August 2024
Joakim Edin, Andreas Geert Motzfeldt, Casper L. Christensen, Tuukka Ruotsalo, Lars Maaløe, Maria Maistro
Papers citing "Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability" (35 papers shown)
GIM: Improved Interpretability for Large Language Models
Joakim Edin, Róbert Csordás, Tuukka Ruotsalo, Zhengxuan Wu, Maria Maistro, Jing-ling Huang, Lars Maaløe
23 May 2025

Fool Me Once? Contrasting Textual and Visual Explanations in a Clinical Decision-Support Setting
Maxime Kayser, Bayar I. Menzat, Cornelius Emde, Bogdan Bercean, Alex Novak, Abdala Espinosa, B. Papież, Susanne Gaube, Thomas Lukasiewicz, Oana-Maria Camburu
16 Oct 2024

An Unsupervised Approach to Achieve Supervised-Level Explainability in Healthcare Records
Joakim Edin, Maria Maistro, Lars Maaløe, Lasse Borgholt, Jakob Drachmann Havtorn, Tuukka Ruotsalo
13 Jun 2024

Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales
Lucas Resck, Marcos M. Raimundo, Jorge Poco
03 Apr 2024

Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training
Dongfang Li, Baotian Hu, Qingcai Chen, Shan He
29 Dec 2023

DecompX: Explaining Transformers Decisions by Propagating Token Decomposition
Ali Modarressi, Mohsen Fayyaz, Ehsan Aghazadeh, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar
05 Jun 2023

Incorporating Attribution Importance for Improving Faithfulness Metrics
Zhixue Zhao, Nikolaos Aletras
17 May 2023

EvalAttAI: A Holistic Approach to Evaluating Attribution Maps in Robust and Non-Robust Models
Ian E. Nielsen, Ravichandran Ramachandran, N. Bouaynaya, Hassan M. Fathallah-Shaykh, Ghulam Rasool
15 Mar 2023

Improving Interpretability via Explicit Word Interaction Graph Layer
Arshdeep Sekhon, Hanjie Chen, A. Shrivastava, Zhe Wang, Yangfeng Ji, Yanjun Qi
03 Feb 2023

Towards Faithful Model Explanation in NLP: A Survey
Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
22 Sep 2022

The Solvability of Interpretability Evaluation Metrics
Yilun Zhou, J. Shah
18 May 2022

An Empirical Study on Explanations in Out-of-Domain Settings
G. Chrysostomou, Nikolaos Aletras
28 Feb 2022

Logic Traps in Evaluating Attribution Scores
Yiming Ju, Yuanzhe Zhang, Zhao Yang, Zhongtao Jiang, Kang Liu, Jun Zhao
12 Sep 2021

Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience
G. Chrysostomou, Nikolaos Aletras
31 Aug 2021

The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations
Peter Hase, Harry Xie, Joey Tianyi Zhou
01 Jun 2021

On the Sensitivity and Stability of Model Interpretations in NLP
Fan Yin, Zhouxing Shi, Cho-Jui Hsieh, Kai-Wei Chang
18 Apr 2021

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
Jasmijn Bastings, Katja Filippova
12 Oct 2020

Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers
Hanjie Chen, Yangfeng Ji
01 Oct 2020

How does this interaction affect me? Interpretable attribution for feature interactions
Michael Tsang, Sirisha Rambhatla, Yan Liu
19 Jun 2020

Evaluating and Aggregating Feature-based Model Explanations
Umang Bhatt, Adrian Weller, J. M. F. Moura
01 May 2020

Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
Alon Jacovi, Yoav Goldberg
07 Apr 2020

ERASER: A Benchmark to Evaluate Rationalized NLP Models
Jay DeYoung, Sarthak Jain, Nazneen Rajani, Eric P. Lehman, Caiming Xiong, R. Socher, Byron C. Wallace
08 Nov 2019

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
02 Oct 2019

One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques
Vijay Arya, Rachel K. E. Bellamy, Pin-Yu Chen, Amit Dhurandhar, Michael Hind, ..., Karthikeyan Shanmugam, Moninder Singh, Kush R. Varshney, Dennis L. Wei, Yunfeng Zhang
06 Sep 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
26 Jul 2019

Attention is not Explanation
Sarthak Jain, Byron C. Wallace
26 Feb 2019

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
11 Oct 2018

A Benchmark for Interpretability Methods in Deep Neural Networks
Sara Hooker, D. Erhan, Pieter-Jan Kindermans, Been Kim
28 Jun 2018

Pathologies of Neural Models Make Interpretations Difficult
Shi Feng, Eric Wallace, Alvin Grissom II, Mohit Iyyer, Pedro Rodriguez, Jordan L. Boyd-Graber
20 Apr 2018

A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg, Su-In Lee
22 May 2017

Learning Important Features Through Propagating Activation Differences
Avanti Shrikumar, Peyton Greenside, A. Kundaje
10 Apr 2017

Axiomatic Attribution for Deep Networks
Mukund Sundararajan, Ankur Taly, Qiqi Yan
04 Mar 2017

Beam Search Strategies for Neural Machine Translation
Markus Freitag, Yaser Al-Onaizan
06 Feb 2017

Evaluating the visualization of what a Deep Neural Network has learned
Wojciech Samek, Alexander Binder, G. Montavon, Sebastian Lapuschkin, K. Müller
21 Sep 2015
Character-level Convolutional Networks for Text Classification
Xiang Zhang, Junbo Zhao, Yann LeCun
04 Sep 2015