v1, v2 (latest)

Attention is not not Explanation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

13 August 2019

Papers citing "Attention is not not Explanation"

50 / 559 papers shown

Malicious Image Analysis via Vision-Language Segmentation Fusion: Detection, Element, and Location in One-shot

189

04 Dec 2025

A Self-explainable Model of Long Time Series by Extracting Informative Structured Causal Patterns

213

01 Dec 2025

Graphing the Truth: Structured Visualizations for Automated Hallucination Detection in LLMs

Tanmay Agrawal

HILM

358

29 Nov 2025

Quantifying Modality Contributions via Disentangling Multimodal Representations

149

22 Nov 2025

CID: Measuring Feature Importance Through Counterfactual Distributions

513

19 Nov 2025

Order-Level Attention Similarity Across Language Models: A Latent Commonality

166

07 Nov 2025

A Dual-Use Framework for Clinical Gait Analysis: Attention-Based Sensor Optimization and Automated Dataset Auditing

Hamidreza Sadeghsalehi

123

03 Nov 2025

A Video Is Not Worth a Thousand Words

Sam Pollard

Michael Wray

139

27 Oct 2025

When LRP Diverges from Leave-One-Out in Transformers

184

21 Oct 2025

EEGChaT: A Transformer-Based Modular Channel Selector for SEEG Analysis

115

15 Oct 2025

Discursive Circuits: How Do Language Models Understand Discourse Relations?

Yisong Miao

Min-Yen Kan

177

13 Oct 2025

Everyone prefers human writers, including AI

Wouter Haverals

Meredith Martin

145

09 Oct 2025

Introspection in Learned Semantic Scene Graph Localisation

Manshika Charvi Bissessur

Efimia Panagiotaki

Daniele De Martini

SSL

252

08 Oct 2025

DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision

125

07 Oct 2025

There is More to Attention: Statistical Filtering Enhances Explanations in Vision Transformers

Meghna P. Ayyar

Jenny Benois-Pineau

A. Zemmari

211

07 Oct 2025

Evaluation Framework for Highlight Explanations of Context Utilisation in Language Models

222

03 Oct 2025

AttentionDep: Domain-Aware Attention for Explainable Depression Severity Assessment

102

01 Oct 2025

Analyzing Latent Concepts in Code Language Models

345

01 Oct 2025

DeepProv: Behavioral Characterization and Repair of Neural Networks via Inference Provenance Graph Analysis

229

30 Sep 2025

Sparse Autoencoders Make Audio Foundation Models more Explainable

152

29 Sep 2025

TDHook: A Lightweight Framework for Interpretability

Yoann Poupart

AI4CE

190

29 Sep 2025

Explaining Fine Tuned LLMs via Counterfactuals A Knowledge Graph Driven Framework

229

25 Sep 2025

AIBA: Attention-based Instrument Band Alignment for Text-to-Audio Diffusion

262

25 Sep 2025

When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models

302

23 Sep 2025

Cross-Attention is Half Explanation in Speech-to-Text Models

226

22 Sep 2025

ConceptViz: A Visual Analytics Approach for Exploring Concepts in Large Language Models

192

20 Sep 2025

Subject Matter Expertise vs Professional Management in Collective Sequential Decision Making

David Shoresh

Yonatan Loewenstein

18 Sep 2025

Copycat vs. Original: Multi-modal Pretraining and Variable Importance in Box-office Prediction

Qin Chao

Eunsoo Kim

Boyang Albert Li

172

18 Sep 2025

ORACLE: Explaining Feature Interactions in Neural Networks with ANOVA

Dongseok Kim

Wonjun Jeong

Mohamed Jismy Aashik Rasool

Gisung Oh

276

13 Sep 2025

Whisper Has an Internal Word Aligner

Sung-Lin Yeh

Yen Meng

Hao Tang

166

12 Sep 2025

An Autoencoder and Vision Transformer-based Interpretability Analysis of the Differences in Automated Staging of Second and Third Molars

207

12 Sep 2025

Do LLMs Adhere to Label Definitions? Examining Their Receptivity to External Label Definitions

Seyedali Mohammadi

Bhaskara Hanuma Vedula

Hemank Lamba

Edward Raff

Ponnurangam Kumaraguru

Francis Ferraro

Manas Gaur

221

02 Sep 2025

MindGuard: Intrinsic Decision Inspection for Securing LLM Agents Against Metadata Poisoning

400

28 Aug 2025

MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs

177

25 Aug 2025

On the effectiveness of multimodal privileged knowledge distillation in two vision transformer based diagnostic applications

Simon Baur

Alexandra Benova

Emilio Dolgener Cantú

Jackie Ma

MedIm

122

06 Aug 2025

User Perception of Attention Visualizations: Effects on Interpretability Across Evidence-Based Medical Documents

166

05 Aug 2025

AttnTrace: Attention-based Context Traceback for Long-Context LLMs

258

05 Aug 2025

Distilling Knowledge from Large Language Models: A Concept Bottleneck Model for Hate and Counter Speech Recognition

Roberto Labadie-Tamayo

Christiane Atzmüller

Matthias Zeppelzauer

156

30 Jul 2025

Contrast-CAT: Contrasting Activations for Enhanced Interpretability in Transformer-based Text Classifiers

Conference on Uncertainty in Artificial Intelligence (UAI), 2025

Sungmin Han

Jeonghyun Lee

Sangkyun Lee

301

27 Jul 2025

Interpretable Open-Vocabulary Referring Object Detection with Reverse Contrast Attention

Drandreb Earl O. Juanico

Rowel O. Atienza

Jeffrey Kenneth Go

ObjD

316

26 Jul 2025

SFT-GO: Supervised Fine-Tuning with Group Optimization for Large Language Models

196

17 Jun 2025

Rethinking Explainability in the Era of Multimodal AI

Chirag Agarwal

304

16 Jun 2025

Towards Large Language Models with Self-Consistent Natural Language Explanations

225

09 Jun 2025

Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety

385

05 Jun 2025

Interpretable phenotyping of Heart Failure patients with Dutch discharge letters

Vittorio Torri

Machteld J. Boonstra

Marielle C. van de Veerdonk

Folkert W. Asselbergs

Iacer Calixto

187

30 May 2025

From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs

247

27 May 2025

LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions

471

27 May 2025

SCAR: Shapley Credit Assignment for More Efficient RLHF

453

26 May 2025

ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior

534

26 May 2025

Advancing the Scientific Method with Large Language Models: From Hypothesis to Discovery

...

453

22 May 2025