ResearchTrend.AI
Interpretation of Neural Networks is Fragile
AAAI Conference on Artificial Intelligence (AAAI), 2017
29 October 2017
Amirata Ghorbani, Abubakar Abid, James Zou
Topics: FAtt, AAML

Papers citing "Interpretation of Neural Networks is Fragile"

50 / 489 papers shown
ORACLE: Explaining Feature Interactions in Neural Networks with ANOVA
Dongseok Kim, Wonjun Jeong, Mohamed Jismy Aashik Rasool, Gisung Oh
24 Dec 2025

ABLE: Using Adversarial Pairs to Construct Local Models for Explaining Model Predictions
Krishna Khadka, Sunny Shree, Pujan Budhathoki, Yu Lei, Raghu Kacker, D. Richard Kuhn
Topics: AAML, FAtt
26 Nov 2025

Accuracy is Not Enough: Poisoning Interpretability in Federated Learning via Color Skew
Farhin Farhad Riya, Shahinul Hoque, J. Sun, Olivera Kotevska
Topics: AAML, FedML, FAtt
17 Nov 2025

Rethinking Saliency Maps: A Cognitive Human Aligned Taxonomy and Evaluation Framework for Explanations
Yehonatan Elisha, Seffi Cohen, Oren Barkan, Noam Koenigstein
Topics: FAtt
17 Nov 2025

Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation
Yannan Chen, Ruoyu Chen, Bin Zeng, Wei Wang, Shiming Liu, Qunli Zhang, Zheng Hu, L. Wang, Y. Wang, Xiaochun Cao
15 Nov 2025
Stable Prediction of Adverse Events in Medical Time-Series Data
Mayank Keoliya, Seewon Choi, Rajeev Alur, Mayur Naik, Eric Wong
Topics: OOD
16 Oct 2025

Restricted Receptive Fields for Face Verification
Kagan Öztürk, Aman Bhatta, Haiyu Wu, Patrick Flynn, Kevin W. Bowyer
Topics: CVBM, FAtt
12 Oct 2025

Attack logics, not outputs: Towards efficient robustification of deep neural networks by falsifying concept-based properties
Raik Dankworth, Gesina Schwalbe
Topics: AAML
01 Oct 2025

Beyond Output Faithfulness: Learning Attributions that Preserve Computational Pathways
Siyu Zhang, Kenneth Mcmillan
04 Sep 2025

GPLight+: A Genetic Programming Method for Learning Symmetric Traffic Signal Control Policy
IEEE Transactions on Evolutionary Computation (IEEE Trans. Evol. Comput.), 2025
Xiao-Cheng Liao, Yi Mei, Mengjie Zhang
22 Aug 2025

On the notion of missingness for path attribution explainability methods in medical settings: Guiding the selection of medically meaningful baselines
Alexander Geiger, Lars Wagner, Daniel Rueckert, Dirk Wilhelm, A. Jell
Topics: OOD, BDL, MedIm
20 Aug 2025
Attribution Explanations for Deep Neural Networks: A Theoretical Perspective
Huiqi Deng, Hongbin Pei, Quanshi Zhang, Mengnan Du
Topics: FAtt
11 Aug 2025

Concept Learning for Cooperative Multi-Agent Reinforcement Learning
Zhonghan Ge, Yuanyang Zhu, Chunlin Chen
27 Jul 2025

Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning via Incorporating Generalized Human Expertise
Xuefei Wu, Xiao Yin, Yuanyang Zhu, Chunlin Chen
25 Jul 2025

Breaking the Illusion of Security via Interpretation: Interpretable Vision Transformer Systems under Attack
Eldor Abdukhamidov, Mohammed Abuhamad, Simon S. Woo, Hyoungshick Kim, Tamer Abuhmed
Topics: AAML
18 Jul 2025

On the Effectiveness of Methods and Metrics for Explainable AI in Remote Sensing Image Scene Classification
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (IEEE J-STARS), 2025
Jonas Klotz, Tom Burgert, Tim Siebert
08 Jul 2025

Pixel-level Certified Explanations via Randomized Smoothing
Alaa Anani, Tobias Lorenz, Mario Fritz, Bernt Schiele
Topics: FAtt, AAML
18 Jun 2025
TriGuard: Testing Model Safety with Attribution Entropy, Verification, and Drift
Dipesh Tharu Mahato, Rohan Poudel, Pramod Dhungana
Topics: AAML
17 Jun 2025

Can Hessian-Based Insights Support Fault Diagnosis in Attention-based Models?
Sigma Jahan, Mohammad Masudur Rahman
09 Jun 2025

Fixed Point Explainability
Emanuele La Malfa, Jon Vadillo, Marco Molinari, Michael Wooldridge
18 May 2025

Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Mahdi Dhaini, Ege Erdogan, Nils Feldhus, Gjergji Kasneci
02 May 2025

Financial Fraud Detection with Entropy Computing
Babak Emami, Wesley Dyk, David Haycraft, Carrie Spear, Lac Nguyen, Nicholas Chancellor
14 Mar 2025

Axiomatic Explainer Globalness via Optimal Transport
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Davin Hill, Josh Bone, A. Masoomi, Max Torop, Jennifer Dy
13 Mar 2025
Birds look like cars: Adversarial analysis of intrinsically interpretable deep learning
Hubert Baniecki, P. Biecek
Topics: AAML
11 Mar 2025

Conceptual Contrastive Edits in Textual and Vision-Language Retrieval
Maria Lymperaiou, Giorgos Stamou
Topics: VLM
01 Mar 2025

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
Thomas Fel, Ekdeep Singh Lubana, Jacob S. Prince, M. Kowal, Victor Boutin, Isabel Papadimitriou, Binxu Wang, Martin Wattenberg, Demba Ba, Talia Konkle
18 Feb 2025

Error-controlled non-additive interaction discovery in machine learning models
Nature Machine Intelligence (Nat. Mach. Intell.), 2024
Winston Chen, Yifan Jiang, William Stafford Noble, Yang Young Lu
17 Feb 2025

We Can't Understand AI Using our Existing Vocabulary
John Hewitt, Robert Geirhos, Been Kim
11 Feb 2025

The Effect of Similarity Measures on Accurate Stability Estimates for Local Surrogate Models in Text-based Explainable AI
Christopher Burger, Charles Walter, Thai Le
Topics: AAML
20 Jan 2025
Explainable Adversarial Attacks on Coarse-to-Fine Classifiers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Akram Heidarizadeh, Connor Hatfield, Lorenzo Lazzarotto, HanQin Cai, George Atia
Topics: AAML
19 Jan 2025

Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Marko Tuononen, Dani Korpi, Ville Hautamäki
Topics: FAtt
10 Jan 2025

Towards Robust and Accurate Stability Estimation of Local Surrogate Models in Text-based Explainable AI
Christopher Burger, Charles Walter, Thai Le, Lingwei Chen
Topics: AAML
03 Jan 2025

Impact of Adversarial Attacks on Deep Learning Model Explainability
Gazi Nazia Nur, Mohammad Ahnaf Sadat
Topics: AAML, FAtt
15 Dec 2024

Quantized and Interpretable Learning Scheme for Deep Neural Networks in Classification Task
Conference Information and Communication Technology (ICT), 2024
Alireza Maleki, Mahsa Lavaei, Mohsen Bagheritabar, Salar Beigzad, Zahra Abadi
Topics: MQ
05 Dec 2024

Machines and Mathematical Mutations: Using GNNs to Characterize Quiver Mutation Classes
Jesse He, Helen Jenne, Herman Chau, Davis Brown, Mark Raugas, Sara Billey, Henry Kvinge
12 Nov 2024
EXAGREE: Mitigating Explanation Disagreement with Stakeholder-Aligned Models
Sichao Li, Tommy Liu, Quanling Deng, Amanda S. Barnard
04 Nov 2024

Transparent Trade-offs between Properties of Explanations
Conference on Uncertainty in Artificial Intelligence (UAI), 2024
Hiwot Belay Tadesse, Alihan Hüyük, Yaniv Yacoby, Weiwei Pan, Finale Doshi-Velez
Topics: FAtt
31 Oct 2024

CausAdv: A Causal-based Framework for Detecting Adversarial Examples
Hichem Debbi
Topics: CML, AAML
29 Oct 2024

Prototype-Based Methods in Explainable AI and Emerging Opportunities in the Geosciences
Anushka Narayanan, Karianne J. Bergen
22 Oct 2024

SSET: Swapping-Sliding Explanation for Time Series Classifiers in Affect Detection
Nazanin Fouladgar, Marjan Alirezaie, Kary Främling
Topics: AI4TS, FAtt
16 Oct 2024

Unlearning-based Neural Interpretations
International Conference on Learning Representations (ICLR), 2024
Ching Lam Choi, Alexandre Duplessis, Serge Belongie
Topics: FAtt
10 Oct 2024
Faithful Interpretation for Graph Neural Networks
Lijie Hu, Tianhao Huang, Lu Yu, Wanyu Lin, Tianhang Zheng, Di Wang
09 Oct 2024

A mechanistically interpretable neural network for regulatory genomics
Alex Tseng, Gökçen Eraslan, Tommaso Biancalani, Gabriele Scalia
08 Oct 2024

Understanding with toy surrogate models in machine learning
Andrés Páez
Topics: SyDa
08 Oct 2024

Mechanistic?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Naomi Saphra, Sarah Wiegreffe
Topics: AI4CE
07 Oct 2024

Run-time Observation Interventions Make Vision-Language-Action Models More Visually Robust
IEEE International Conference on Robotics and Automation (ICRA), 2024
Asher Hancock, Allen Z. Ren, Anirudha Majumdar
Topics: VLM
02 Oct 2024

Faithfulness and the Notion of Adversarial Sensitivity in NLP Explanations
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Supriya Manna, Niladri Sett
Topics: AAML
26 Sep 2024

Deep Manifold Part 1: Anatomy of Neural Network Manifold
Max Y. Ma, Gen-Hua Shi
Topics: PINN, 3DPC
26 Sep 2024
Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
Yi Zhang, Zhen Chen, Chih-Hong Cheng, Wenjie Ruan, Xiaowei Huang, Dezong Zhao, David Flynn, Siddartha Khastgir, Xingyu Zhao
Topics: MedIm
26 Sep 2024

Leveraging Local Structure for Improving Model Explanations: An Information Propagation Approach
International Conference on Information and Knowledge Management (CIKM), 2024
Ruo Yang, Binghui Wang, M. Bilgic
Topics: FAtt
24 Sep 2024