The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?

12 October 2020
Jasmijn Bastings
Katja Filippova
XAI, LRM

Papers citing "The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?"

50 / 99 papers shown
Concise and Sufficient Sub-Sentence Citations for Retrieval-Augmented Generation
Guo Chen
Qiuyuan Li
Qiuxian Li
Hongliang Dai
Xiang Chen
Piji Li
3DV, HILM
199
0
0
25 Sep 2025
Cross-Attention is Half Explanation in Speech-to-Text Models
Sara Papi
Dennis Fucci
Marco Gaido
Matteo Negri
L. Bentivogli
LRM
181
1
0
22 Sep 2025
SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning
Yue Xin
Chen Shen
Shaotian Yan
Xiaosong Yuan
Yaoming Wang
Xiaofeng Zhang
Chenxi Huang
Jieping Ye
ReLM, LRM
146
0
0
20 Sep 2025
Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs
Sayed Mohammad Vakilzadeh Hatefi
Maximilian Dreyer
Reduan Achtibat
Patrick Kahardipraja
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
269
2
0
16 Jun 2025
On the reliability of feature attribution methods for speech classification
Gaofei Shen
Hosein Mohebbi
Arianna Bisazza
Afra Alishahi
Grzegorz Chrupała
424
0
0
22 May 2025
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
Patrick Kahardipraja
Reduan Achtibat
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
388
4
0
21 May 2025
Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
Yuhao Wang
Ruiyang Ren
Yucheng Wang
Wayne Xin Zhao
Jing Liu
Hua Wu
Haifeng Wang
232
1
0
17 May 2025
Enabling Global, Human-Centered Explanations for LLMs: From Tokens to Interpretable Code and Test Generation
David Nader-Palacio
Dipin Khati
Daniel Rodríguez-Cárdenas
Alejandro Velasco
Denys Poshyvanyk
LRM
338
5
0
21 Mar 2025
Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation
International Conference on Applications of Natural Language to Data Bases (NLDB), 2025
Duc Hau Nguyen
Cyrielle Mallart
Guillaume Gravier
Pascale Sébillot
291
1
0
22 Jan 2025
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Joakim Edin
Andreas Geert Motzfeldt
Casper L. Christensen
Tuukka Ruotsalo
Lars Maaløe
Maria Maistro
485
6
0
15 Aug 2024
Validating Mechanistic Interpretations: An Axiomatic Approach
Nils Palumbo
Ravi Mangal
Zifan Wang
Saranya Vijayakumar
Corina S. Pasareanu
Somesh Jha
314
1
0
18 Jul 2024
A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE)
Daniel Sonntag
Michael Barz
Thiago S. Gouvêa
VLM
313
6
0
27 Jun 2024
Interpretability Needs a New Paradigm
Andreas Madsen
Himabindu Lakkaraju
Siva Reddy
Sarath Chandar
211
7
0
08 May 2024
Unraveling the Dilemma of AI Errors: Exploring the Effectiveness of Human and Machine Explanations for Large Language Models
Marvin Pafla
Kate Larson
Mark Hancock
230
7
0
11 Apr 2024
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models
Igor Tufanov
Karen Hambardzumyan
Javier Ferrando
Elena Voita
KELM
267
15
0
10 Apr 2024
On the Faithfulness of Vision Transformer Explanations
Junyi Wu
Weitai Kang
Hao Tang
Yuan Hong
Yan Yan
294
12
0
01 Apr 2024
Towards Explainability in Legal Outcome Prediction Models
Josef Valvoda
Robert Bamler
ELM, AILaw
321
8
0
25 Mar 2024
Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models
Zhixue Zhao
Nikolaos Aletras
259
9
0
19 Mar 2024
Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics
International Conference on Language Resources and Evaluation (LREC), 2024
Tyler A. Chang
Katrin Tomanek
Jessica Hoffmann
Nithum Thain
Erin van Liemt
Kathleen Meier-Hellstern
Lucas Dixon
316
12
0
13 Mar 2024
Information Flow Routes: Automatically Interpreting Language Models at Scale
Javier Ferrando
Elena Voita
395
74
0
27 Feb 2024
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
International Conference on Machine Learning (ICML), 2024
Gianluigi Lopardo
F. Precioso
Damien Garreau
267
14
0
05 Feb 2024
Approximate Attributions for Off-the-Shelf Siamese Transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Lucas Moller
Dmitry Nikolaev
Sebastian Padó
268
7
0
05 Feb 2024
ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models
Zhixue Zhao
Boxuan Shan
343
12
0
01 Feb 2024
XAI for In-hospital Mortality Prediction via Multimodal ICU Data
Xingqiao Li
Jindong Gu
Zhiyong Wang
Yancheng Yuan
Bo Du
Fengxiang He
189
3
0
29 Dec 2023
Attribution and Alignment: Effects of Local Context Repetition on Utterance Production and Comprehension in Dialogue
Conference on Computational Natural Language Learning (CoNLL), 2023
Aron Molnar
Jaap Jumelet
Mario Giulianelli
Arabella J. Sinclair
259
2
0
21 Nov 2023
Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups
Weiqiu You
Helen Qu
Marco Gatti
Bhuvnesh Jain
Eric Wong
FAtt, FaML
439
4
0
25 Oct 2023
REFER: An End-to-end Rationale Extraction Framework for Explanation Regularization
Conference on Computational Natural Language Learning (CoNLL), 2023
Mohammad Reza Ghasemi Madani
Pasquale Minervini
246
5
0
22 Oct 2023
An Interpretable Deep-Learning Framework for Predicting Hospital Readmissions From Electronic Health Records
Fabio Azzalini
Tommaso Dolci
Marco Vagaggini
OOD
402
2
0
16 Oct 2023
An Attribution Method for Siamese Encoders
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Lucas Moller
Dmitry Nikolaev
Sebastian Padó
408
7
0
09 Oct 2023
Quantifying the Plausibility of Context Reliance in Neural Machine Translation
International Conference on Learning Representations (ICLR), 2023
Gabriele Sarti
Grzegorz Chrupala
Malvina Nissim
Arianna Bisazza
316
6
0
02 Oct 2023
Attention Sorting Combats Recency Bias In Long Context Language Models
A. Peysakhovich
Adam Lerer
LRM, RALM
339
86
0
28 Sep 2023
Exploring Different Levels of Supervision for Detecting and Localizing Solar Panels on Remote Sensing Imagery
Maarten Burger
R. Wijnhoven
Shaodi You
207
1
0
19 Sep 2023
Unsupervised Text Style Transfer with Deep Generative Models
Zhongtao Jiang
Yuanzhe Zhang
Yiming Ju
Kang Liu
282
0
0
31 Aug 2023
Decoding Layer Saliency in Language Transformers
International Conference on Machine Learning (ICML), 2023
Elizabeth M. Hou
Greg Castañón
MILM
291
4
0
09 Aug 2023
ALens: An Adaptive Domain-Oriented Abstract Writing Training Tool for Novice Researchers
Chen Cheng
Ziang Li
Zhenhui Peng
Quan Li
246
1
0
08 Aug 2023
Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Haotian Chen
Bingsheng Chen
Xiangdong Zhou
282
8
0
20 Jun 2023
B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Moritz D Boehle
Navdeeppal Singh
Mario Fritz
Bernt Schiele
405
40
0
19 Jun 2023
Using Sequences of Life-events to Predict Human Lives
Nature Computational Science (Nat. Comput. Sci.), 2023
Germans Savcisens
Tina Eliassi-Rad
L. K. Hansen
L. Mortensen
Lau Lilleholt
Anna Rogers
Ingo Zettler
Sune Lehmann
AI4TS
259
78
0
05 Jun 2023
DecompX: Explaining Transformers Decisions by Propagating Token Decomposition
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ali Modarressi
Mohsen Fayyaz
Ehsan Aghazadeh
Yadollah Yaghoobzadeh
Mohammad Taher Pilehvar
324
37
0
05 Jun 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
413
244
0
02 Jun 2023
HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
David Dale
Elena Voita
Janice Lam
Prangthip Hansanti
C. Ropers
Elahe Kalbassi
Cynthia Gao
Loïc Barrault
Marta R. Costa-jussá
HILM
397
37
0
19 May 2023
Incorporating Attribution Importance for Improving Faithfulness Metrics
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhixue Zhao
Nikolaos Aletras
407
16
0
17 May 2023
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Siyue Wu
Hongzhan Chen
Xiaojun Quan
Qifan Wang
Rui Wang
VLM
412
31
0
17 May 2023
ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing
Hua Shen
Huang Chieh-Yang
Tongshuang Wu
Ting-Hao 'Kenneth' Huang
506
46
0
16 May 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
843
438
0
28 Apr 2023
Evaluating self-attention interpretability through human-grounded experimental protocol
Milan Bhan
Nina Achache
Victor Legrand
A. Blangero
Nicolas Chesneau
193
12
0
27 Mar 2023
Holistically Explainable Vision Transformers
Moritz D Boehle
Mario Fritz
Bernt Schiele
ViT
313
10
0
20 Jan 2023
Opti-CAM: Optimizing saliency maps for interpretability
Computer Vision and Image Understanding (CVIU), 2023
Hanwei Zhang
Felipe Torres
R. Sicre
Yannis Avrithis
Stéphane Ayache
563
45
0
17 Jan 2023
DExT: Detector Explanation Toolkit
Deepan Padmanabhan
Paul G. Plöger
Octavio Arriaga
Matias Valdenegro-Toro
221
2
0
21 Dec 2022
Human-Guided Fair Classification for Natural Language Processing
International Conference on Learning Representations (ICLR), 2022
Florian E. Dorner
Momchil Peychev
Nikola Konstantinov
Naman Goel
Elliott Ash
Martin Vechev
FaML
296
7
0
20 Dec 2022