2010.05607
Cited By
The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
12 October 2020
Jasmijn Bastings
Katja Filippova
XAI
LRM
Papers citing
"The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?"
50 / 99 papers shown
Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation
Guo Chen
Qiuyuan Li
Qiuxian Li
Hongliang Dai
Xiang Chen
Piji Li
3DV
HILM
199
0
0
25 Sep 2025
Cross-Attention is Half Explanation in Speech-to-Text Models
Sara Papi
Dennis Fucci
Marco Gaido
Matteo Negri
L. Bentivogli
LRM
181
1
0
22 Sep 2025
SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning
Yue Xin
Chen Shen
Shaotian Yan
Xiaosong Yuan
Yaoming Wang
Xiaofeng Zhang
Chenxi Huang
Jieping Ye
ReLM
LRM
146
0
0
20 Sep 2025
Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs
Sayed Mohammad Vakilzadeh Hatefi
Maximilian Dreyer
Reduan Achtibat
Patrick Kahardipraja
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
269
2
0
16 Jun 2025
On the reliability of feature attribution methods for speech classification
Gaofei Shen
Hosein Mohebbi
Arianna Bisazza
Afra Alishahi
Grzegorz Chrupała
424
0
0
22 May 2025
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
Patrick Kahardipraja
Reduan Achtibat
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
388
4
0
21 May 2025
Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
Yuhao Wang
Ruiyang Ren
Yucheng Wang
Wayne Xin Zhao
Jing Liu
Hua Wu
Haifeng Wang
232
1
0
17 May 2025
Enabling Global, Human-Centered Explanations for LLMs: From Tokens to Interpretable Code and Test Generation
David Nader-Palacio
Dipin Khati
Daniel Rodríguez-Cárdenas
Alejandro Velasco
Denys Poshyvanyk
LRM
338
5
0
21 Mar 2025
Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation
International Conference on Applications of Natural Language to Data Bases (NLDB), 2025
Duc Hau Nguyen
Cyrielle Mallart
Guillaume Gravier
Pascale Sébillot
291
1
0
22 Jan 2025
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Joakim Edin
Andreas Geert Motzfeldt
Casper L. Christensen
Tuukka Ruotsalo
Lars Maaløe
Maria Maistro
485
6
0
15 Aug 2024
Validating Mechanistic Interpretations: An Axiomatic Approach
Nils Palumbo
Ravi Mangal
Zifan Wang
Saranya Vijayakumar
Corina S. Pasareanu
Somesh Jha
314
1
0
18 Jul 2024
A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE)
Daniel Sonntag
Michael Barz
Thiago S. Gouvêa
VLM
313
6
0
27 Jun 2024
Interpretability Needs a New Paradigm
Andreas Madsen
Himabindu Lakkaraju
Siva Reddy
Sarath Chandar
211
7
0
08 May 2024
Unraveling the Dilemma of AI Errors: Exploring the Effectiveness of Human and Machine Explanations for Large Language Models
Marvin Pafla
Kate Larson
Mark Hancock
230
7
0
11 Apr 2024
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models
Igor Tufanov
Karen Hambardzumyan
Javier Ferrando
Elena Voita
KELM
267
15
0
10 Apr 2024
On the Faithfulness of Vision Transformer Explanations
Junyi Wu
Weitai Kang
Hao Tang
Yuan Hong
Yan Yan
294
12
0
01 Apr 2024
Towards Explainability in Legal Outcome Prediction Models
Josef Valvoda
Robert Bamler
ELM
AILaw
321
8
0
25 Mar 2024
Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models
Zhixue Zhao
Nikolaos Aletras
259
9
0
19 Mar 2024
Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics
International Conference on Language Resources and Evaluation (LREC), 2024
Tyler A. Chang
Katrin Tomanek
Jessica Hoffmann
Nithum Thain
Erin van Liemt
Kathleen Meier-Hellstern
Lucas Dixon
316
12
0
13 Mar 2024
Information Flow Routes: Automatically Interpreting Language Models at Scale
Javier Ferrando
Elena Voita
395
74
0
27 Feb 2024
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
International Conference on Machine Learning (ICML), 2024
Gianluigi Lopardo
F. Precioso
Damien Garreau
267
14
0
05 Feb 2024
Approximate Attributions for Off-the-Shelf Siamese Transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Lucas Moller
Dmitry Nikolaev
Sebastian Padó
268
7
0
05 Feb 2024
ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models
Zhixue Zhao
Boxuan Shan
343
12
0
01 Feb 2024
XAI for In-hospital Mortality Prediction via Multimodal ICU Data
Xingqiao Li
Jindong Gu
Zhiyong Wang
Yancheng Yuan
Bo Du
Fengxiang He
189
3
0
29 Dec 2023
Attribution and Alignment: Effects of Local Context Repetition on Utterance Production and Comprehension in Dialogue
Conference on Computational Natural Language Learning (CoNLL), 2023
Aron Molnar
Jaap Jumelet
Mario Giulianelli
Arabella J. Sinclair
259
2
0
21 Nov 2023
Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups
Weiqiu You
Helen Qu
Marco Gatti
Bhuvnesh Jain
Eric Wong
FAtt
FaML
439
4
0
25 Oct 2023
REFER: An End-to-end Rationale Extraction Framework for Explanation Regularization
Conference on Computational Natural Language Learning (CoNLL), 2023
Mohammad Reza Ghasemi Madani
Pasquale Minervini
246
5
0
22 Oct 2023
An Interpretable Deep-Learning Framework for Predicting Hospital Readmissions From Electronic Health Records
Fabio Azzalini
Tommaso Dolci
Marco Vagaggini
OOD
402
2
0
16 Oct 2023
An Attribution Method for Siamese Encoders
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Lucas Moller
Dmitry Nikolaev
Sebastian Padó
408
7
0
09 Oct 2023
Quantifying the Plausibility of Context Reliance in Neural Machine Translation
International Conference on Learning Representations (ICLR), 2023
Gabriele Sarti
Grzegorz Chrupała
Malvina Nissim
Arianna Bisazza
316
6
0
02 Oct 2023
Attention Sorting Combats Recency Bias In Long Context Language Models
A. Peysakhovich
Adam Lerer
LRM
RALM
339
86
0
28 Sep 2023
Exploring Different Levels of Supervision for Detecting and Localizing Solar Panels on Remote Sensing Imagery
Maarten Burger
R. Wijnhoven
Shaodi You
207
1
0
19 Sep 2023
Unsupervised Text Style Transfer with Deep Generative Models
Zhongtao Jiang
Yuanzhe Zhang
Yiming Ju
Kang Liu
282
0
0
31 Aug 2023
Decoding Layer Saliency in Language Transformers
International Conference on Machine Learning (ICML), 2023
Elizabeth M. Hou
Greg Castañón
MILM
291
4
0
09 Aug 2023
ALens: An Adaptive Domain-Oriented Abstract Writing Training Tool for Novice Researchers
Chen Cheng
Ziang Li
Zhenhui Peng
Quan Li
246
1
0
08 Aug 2023
Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Haotian Chen
Bingsheng Chen
Xiangdong Zhou
282
8
0
20 Jun 2023
B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Moritz D Boehle
Navdeeppal Singh
Mario Fritz
Bernt Schiele
405
40
0
19 Jun 2023
Using Sequences of Life-events to Predict Human Lives
Nature Computational Science (Nat. Comput. Sci.), 2023
Germans Savcisens
Tina Eliassi-Rad
L. K. Hansen
L. Mortensen
Lau Lilleholt
Anna Rogers
Ingo Zettler
Sune Lehmann
AI4TS
259
78
0
05 Jun 2023
DecompX: Explaining Transformers Decisions by Propagating Token Decomposition
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ali Modarressi
Mohsen Fayyaz
Ehsan Aghazadeh
Yadollah Yaghoobzadeh
Mohammad Taher Pilehvar
324
37
0
05 Jun 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
413
244
0
02 Jun 2023
HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
David Dale
Elena Voita
Janice Lam
Prangthip Hansanti
C. Ropers
Elahe Kalbassi
Cynthia Gao
Loïc Barrault
Marta R. Costa-jussá
HILM
397
37
0
19 May 2023
Incorporating Attribution Importance for Improving Faithfulness Metrics
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhixue Zhao
Nikolaos Aletras
407
16
0
17 May 2023
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Siyue Wu
Hongzhan Chen
Xiaojun Quan
Qifan Wang
Rui Wang
VLM
412
31
0
17 May 2023
ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing
Hua Shen
Huang Chieh-Yang
Tongshuang Wu
Ting-Hao 'Kenneth' Huang
506
46
0
16 May 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
843
438
0
28 Apr 2023
Evaluating self-attention interpretability through human-grounded experimental protocol
Milan Bhan
Nina Achache
Victor Legrand
A. Blangero
Nicolas Chesneau
193
12
0
27 Mar 2023
Holistically Explainable Vision Transformers
Moritz D Boehle
Mario Fritz
Bernt Schiele
ViT
313
10
0
20 Jan 2023
Opti-CAM: Optimizing saliency maps for interpretability
Computer Vision and Image Understanding (CVIU), 2023
Hanwei Zhang
Felipe Torres
R. Sicre
Yannis Avrithis
Stéphane Ayache
563
45
0
17 Jan 2023
DExT: Detector Explanation Toolkit
Deepan Padmanabhan
Paul G. Plöger
Octavio Arriaga
Matias Valdenegro-Toro
221
2
0
21 Dec 2022
Human-Guided Fair Classification for Natural Language Processing
International Conference on Learning Representations (ICLR), 2022
Florian E. Dorner
Momchil Peychev
Nikola Konstantinov
Naman Goel
Elliott Ash
Martin Vechev
FaML
296
7
0
20 Dec 2022