Learning to Deceive with Attention-Based Explanations

Annual Meeting of the Association for Computational Linguistics (ACL), 2019
17 September 2019
Danish Pruthi
Mansi Gupta
Bhuwan Dhingra
Graham Neubig
Zachary Chase Lipton

Papers citing "Learning to Deceive with Attention-Based Explanations"

50 / 109 papers shown
Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection?
Yifan Wang
Mayank Jobanputra
Ji-Ung Lee
Soyoung Oh
Isabel Valera
Vera Demberg
281
1
0
26 Sep 2025
Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models
Guy Kaplan
Michael Toker
Yuval Reif
Yonatan Belinkov
Roy Schwartz
DiffM
508
2
0
01 Apr 2025
B-cos LM: Efficiently Transforming Pre-trained Language Models for Improved Explainability
Yifan Wang
Sukrut Rao
Ji-Ung Lee
Mayank Jobanputra
Vera Demberg
337
0
0
18 Feb 2025
Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation
International Conference on Applications of Natural Language to Data Bases (NLDB), 2025
Duc Hau Nguyen
Cyrielle Mallart
Guillaume Gravier
Pascale Sébillot
345
1
0
22 Jan 2025
Explanation Regularisation through the Lens of Attributions
Pedro Ferreira
Wilker Aziz
Ivan Titov
611
2
0
23 Jul 2024
They Look Like Each Other: Case-based Reasoning for Explainable Depression Detection on Twitter using Large Language Models
Mohammad Saeid Mahdavinejad
Peyman Adibi
A. Monadjemi
Pascal Hitzler
350
1
0
21 Jul 2024
Validating Mechanistic Interpretations: An Axiomatic Approach
Nils Palumbo
Ravi Mangal
Zifan Wang
Saranya Vijayakumar
Corina S. Pasareanu
Somesh Jha
378
1
0
18 Jul 2024
InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States
Mohammad Beigi
Ying Shen
Runing Yang
Zihao Lin
Qifan Wang
Ankith Mohan
Jianfeng He
Ming Jin
Chang-Tien Lu
Lifu Huang
HILM
300
23
0
17 Jun 2024
PEACH: Pretrained-embedding Explanation Across Contextual and Hierarchical Structure
Feiqi Cao
S. Han
Hyunsuk Chung
334
0
0
21 Apr 2024
Towards a Framework for Evaluating Explanations in Automated Fact Verification
Neema Kotonya
Francesca Toni
326
9
0
29 Mar 2024
From Explainable to Interpretable Deep Learning for Natural Language Processing in Healthcare: How Far from Reality?
Computational and Structural Biotechnology Journal (CSBJ), 2024
Guangming Huang
Yingya Li
Shoaib Jameel
Yunfei Long
G. Papanastasiou
350
47
0
18 Mar 2024
RORA: Robust Free-Text Rationale Evaluation
Zhengping Jiang
Yining Lu
Hanjie Chen
Daniel Khashabi
Benjamin Van Durme
Anqi Liu
315
7
0
28 Feb 2024
CMA-R: Causal Mediation Analysis for Explaining Rumour Detection
Lin Tian
Xiuzhen Zhang
Jey Han Lau
317
0
0
13 Feb 2024
SoK: Taming the Triangle -- On the Interplays between Fairness, Interpretability and Privacy in Machine Learning
Julien Ferry
Ulrich Aïvodji
Sébastien Gambs
Marie-José Huguet
Mohamed Siala
FaML
362
7
0
22 Dec 2023
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman
Andrew Kyle Lampinen
Lucas Dixon
Danqi Chen
Asma Ghandeharioun
399
20
0
06 Dec 2023
How Well Do Feature-Additive Explainers Explain Feature-Additive Predictors?
Zachariah Carmichael
Walter J. Scheirer
FAtt
304
9
0
27 Oct 2023
REFER: An End-to-end Rationale Extraction Framework for Explanation Regularization
Conference on Computational Natural Language Learning (CoNLL), 2023
Mohammad Reza Ghasemi Madani
Pasquale Minervini
310
5
0
22 Oct 2023
Make Your Decision Convincing! A Unified Two-Stage Framework: Self-Attribution and Decision-Making
Yanrui Du
Sendong Zhao
Hao Wang
Yuhan Chen
Rui Bai
Zewen Qiang
Muzhen Cai
Bing Qin
205
1
0
20 Oct 2023
Why bother with geometry? On the relevance of linear decompositions of Transformer embeddings
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023
Timothee Mickus
Ananda Sreenidhi
257
3
0
10 Oct 2023
Evaluating Explanation Methods for Vision-and-Language Navigation
European Conference on Artificial Intelligence (ECAI), 2023
Guanqi Chen
Lei Yang
Guanhua Chen
Jia Pan
XAI
290
1
0
10 Oct 2023
Towards Better Chain-of-Thought Prompting Strategies: A Survey
Zihan Yu
Liang He
Zhen Wu
Xinyu Dai
Jiajun Chen
LRM
502
90
0
08 Oct 2023
ViT-ReciproCAM: Gradient and Attention-Free Visual Explanations for Vision Transformer
Seokhyun Byun
Won-Jo Lee
FAtt
261
10
0
04 Oct 2023
Goodhart's Law Applies to NLP's Explanation Benchmarks
Findings (Findings), 2023
Jennifer Hsia
Danish Pruthi
Aarti Singh
Zachary Chase Lipton
259
8
0
28 Aug 2023
Decoding Layer Saliency in Language Transformers
International Conference on Machine Learning (ICML), 2023
Elizabeth M. Hou
Greg Castañón
MILM
342
4
0
09 Aug 2023
R-Cut: Enhancing Explainability in Vision Transformers with Relationship Weighted Out and Cut
Italian National Conference on Sensors (INS), 2023
Yingjie Niu
Ming Ding
Maoning Ge
Robin Karlsson
Yuxiao Zhang
K. Takeda
ViT
184
6
0
18 Jul 2023
A Novel Counterfactual Data Augmentation Method for Aspect-Based Sentiment Analysis
Asian Conference on Machine Learning (ACML), 2023
Dongming Wu
Lulu Wen
Chao Chen
Zhaoshu Shi
252
6
0
20 Jun 2023
Genomic Interpreter: A Hierarchical Genomic Deep Neural Network with 1D Shifted Window Transformer
Zehui Li
Akashaditya Das
W. Beardall
Yiren Zhao
Guy-Bart Stan
272
6
0
08 Jun 2023
Robust Natural Language Understanding with Residual Attention Debiasing
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Fei Wang
James Y. Huang
Tianyi Yan
Wenxuan Zhou
Muhao Chen
202
13
0
28 May 2023
Explaining How Transformers Use Context to Build Predictions
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Javier Ferrando
Gerard I. Gállego
Ioannis Tsiamas
Marta R. Costa-jussá
196
54
0
21 May 2023
COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Fanny Jourdan
Agustin Picard
Thomas Fel
Laurent Risser
Jean-Michel Loubes
Nicholas M. Asher
295
17
0
11 May 2023
Faithful Chain-of-Thought Reasoning
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Qing Lyu
Shreya Havaldar
Adam Stein
Li Zhang
D. Rao
Eric Wong
Marianna Apidianaki
Chris Callison-Burch
ReLM
LRM
640
366
0
31 Jan 2023
Tensions Between the Proxies of Human Values in AI
Teresa Datta
D. Nissani
Max Cembalest
Akash Khanna
Haley Massa
John P. Dickerson
243
4
0
14 Dec 2022
MEGAN: Multi-Explanation Graph Attention Network
Jonas Teufel
Luca Torresi
Patrick Reiser
Pascal Friederich
232
9
0
23 Nov 2022
ViT-CX: Causal Explanation of Vision Transformers
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Weiyan Xie
Xiao-hui Li
Caleb Chen Cao
Nevin L. Zhang
ViT
429
38
0
06 Nov 2022
Salience Allocation as Guidance for Abstractive Summarization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Fei Wang
Kaiqiang Song
Hongming Zhang
Lifeng Jin
Sangwoo Cho
Wenlin Yao
Xiaoyang Wang
Muhao Chen
Dong Yu
206
43
0
22 Oct 2022
Beyond Model Interpretability: On the Faithfulness and Adversarial Robustness of Contrastive Textual Explanations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Julia El Zini
M. Awad
AAML
237
2
0
17 Oct 2022
StyLEx: Explaining Style Using Human Lexical Annotations
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Shirley Anugrah Hayati
Kyumin Park
Dheeraj Rajagopal
Lyle Ungar
Luan Tuyen Chau
436
3
0
14 Oct 2022
On the Explainability of Natural Language Processing Deep Models
ACM Computing Surveys (ACM CSUR), 2022
Julia El Zini
M. Awad
312
116
0
13 Oct 2022
Explanations, Fairness, and Appropriate Reliance in Human-AI Decision-Making
International Conference on Human Factors in Computing Systems (CHI), 2022
Jakob Schoeffer
Maria De-Arteaga
Niklas Kuehl
FaML
552
84
0
23 Sep 2022
Towards Faithful Model Explanation in NLP: A Survey
Computational Linguistics (CL), 2022
Qing Lyu
Marianna Apidianaki
Chris Callison-Burch
XAI
639
189
0
22 Sep 2022
Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Nuno M. Guerreiro
Elena Voita
André F. T. Martins
HILM
364
70
0
10 Aug 2022
Interpretable by Design: Learning Predictors by Composing Interpretable Queries
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Aditya Chattopadhyay
Stewart Slocum
B. Haeffele
René Vidal
D. Geman
307
33
0
03 Jul 2022
How to Dissect a Muppet: The Structure of Transformer Embedding Spaces
Transactions of the Association for Computational Linguistics (TACL), 2022
Timothee Mickus
Denis Paperno
Mathieu Constant
313
29
0
07 Jun 2022
On the Relationship Between Explanations, Fairness Perceptions, and Decisions
Jakob Schoeffer
Maria De-Arteaga
Niklas Kuehl
FaML
305
7
0
27 Apr 2022
Grad-SAM: Explaining Transformers via Gradient Self-Attention Maps
International Conference on Information and Knowledge Management (CIKM), 2021
Oren Barkan
Edan Hauon
Avi Caciularu
Ori Katz
Itzik Malkiel
Omri Armstrong
Noam Koenigstein
275
62
0
23 Apr 2022
The Risks of Machine Learning Systems
Samson Tan
Araz Taeihagh
K. Baxter
170
9
0
21 Apr 2022
ProtoTEx: Explaining Model Decisions with Prototype Tensors
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Anubrata Das
Chitrank Gupta
Venelin Kovatchev
Matthew Lease
Junjie Li
234
33
0
11 Apr 2022
Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary
N. Chatterjee
S. K. Saha
FAtt
255
19
0
31 Mar 2022
Measuring the Mixing of Contextual Information in the Transformer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Javier Ferrando
Gerard I. Gállego
Marta R. Costa-jussá
374
74
0
08 Mar 2022
Hierarchical Interpretation of Neural Text Classification
Computational Linguistics (CL), 2022
Hanqi Yan
Lin Gui
Yulan He
396
17
0
20 Feb 2022