ResearchTrend.AI
Interpretation of Neural Networks is Fragile
AAAI Conference on Artificial Intelligence (AAAI), 2017
29 October 2017
Amirata Ghorbani, Abubakar Abid, James Zou
Topics: FAtt, AAML

Papers citing "Interpretation of Neural Networks is Fragile"

50 / 489 papers shown
ORACLE: Explaining Feature Interactions in Neural Networks with ANOVA
Dongseok Kim, Wonjun Jeong, Mohamed Jismy Aashik Rasool, Gisung Oh
24 Dec 2025

ABLE: Using Adversarial Pairs to Construct Local Models for Explaining Model Predictions
Krishna Khadka, Sunny Shree, Pujan Budhathoki, Yu Lei, Raghu Kacker, D. Richard Kuhn
Topics: AAML, FAtt
26 Nov 2025

Accuracy is Not Enough: Poisoning Interpretability in Federated Learning via Color Skew
Farhin Farhad Riya, Shahinul Hoque, J. Sun, Olivera Kotevska
Topics: AAML, FedML, FAtt
17 Nov 2025

Rethinking Saliency Maps: A Cognitive Human Aligned Taxonomy and Evaluation Framework for Explanations
Yehonatan Elisha, Seffi Cohen, Oren Barkan, Noam Koenigstein
Topics: FAtt
17 Nov 2025

Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation
Yannan Chen, Ruoyu Chen, Bin Zeng, Wei Wang, Shiming Liu, Qunli Zhang, Zheng Hu, L. Wang, Y. Wang, Xiaochun Cao
15 Nov 2025
Stable Prediction of Adverse Events in Medical Time-Series Data
Mayank Keoliya, Seewon Choi, Rajeev Alur, Mayur Naik, Eric Wong
Topics: OOD
16 Oct 2025

Restricted Receptive Fields for Face Verification
Kagan Öztürk, Aman Bhatta, Haiyu Wu, Patrick Flynn, Kevin W. Bowyer
Topics: CVBM, FAtt
12 Oct 2025

Attack logics, not outputs: Towards efficient robustification of deep neural networks by falsifying concept-based properties
Raik Dankworth, Gesina Schwalbe
Topics: AAML
01 Oct 2025

Beyond Output Faithfulness: Learning Attributions that Preserve Computational Pathways
Siyu Zhang, Kenneth Mcmillan
04 Sep 2025

GPLight+: A Genetic Programming Method for Learning Symmetric Traffic Signal Control Policy
IEEE Transactions on Evolutionary Computation (IEEE Trans. Evol. Comput.), 2025
Xiao-Cheng Liao, Yi Mei, Mengjie Zhang
22 Aug 2025

On the notion of missingness for path attribution explainability methods in medical settings: Guiding the selection of medically meaningful baselines
Alexander Geiger, Lars Wagner, Daniel Rueckert, Dirk Wilhelm, A. Jell
Topics: OOD, BDL, MedIm
20 Aug 2025
Attribution Explanations for Deep Neural Networks: A Theoretical Perspective
Huiqi Deng, Hongbin Pei, Quanshi Zhang, Mengnan Du
Topics: FAtt
11 Aug 2025

Concept Learning for Cooperative Multi-Agent Reinforcement Learning
Zhonghan Ge, Yuanyang Zhu, Chunlin Chen
27 Jul 2025

Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning via Incorporating Generalized Human Expertise
Xuefei Wu, Xiao Yin, Yuanyang Zhu, Chunlin Chen
25 Jul 2025

Breaking the Illusion of Security via Interpretation: Interpretable Vision Transformer Systems under Attack
Eldor Abdukhamidov, Mohammed Abuhamad, Simon S. Woo, Hyoungshick Kim, Tamer Abuhmed
Topics: AAML
18 Jul 2025

On the Effectiveness of Methods and Metrics for Explainable AI in Remote Sensing Image Scene Classification
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (IEEE J-STARS), 2025
Jonas Klotz, Tom Burgert, Tim Siebert
08 Jul 2025

Pixel-level Certified Explanations via Randomized Smoothing
Alaa Anani, Tobias Lorenz, Mario Fritz, Bernt Schiele
Topics: FAtt, AAML
18 Jun 2025
TriGuard: Testing Model Safety with Attribution Entropy, Verification, and Drift
Dipesh Tharu Mahato, Rohan Poudel, Pramod Dhungana
Topics: AAML
17 Jun 2025

Can Hessian-Based Insights Support Fault Diagnosis in Attention-based Models?
Sigma Jahan, Mohammad Masudur Rahman
09 Jun 2025

Fixed Point Explainability
Emanuele La Malfa, Jon Vadillo, Marco Molinari, Michael Wooldridge
18 May 2025

Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Mahdi Dhaini, Ege Erdogan, Nils Feldhus, Gjergji Kasneci
02 May 2025

Financial Fraud Detection with Entropy Computing
Babak Emami, Wesley Dyk, David Haycraft, Carrie Spear, Lac Nguyen, Nicholas Chancellor
14 Mar 2025

Axiomatic Explainer Globalness via Optimal Transport
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Davin Hill, Josh Bone, A. Masoomi, Max Torop, Jennifer Dy
13 Mar 2025
Birds look like cars: Adversarial analysis of intrinsically interpretable deep learning
Hubert Baniecki, P. Biecek
Topics: AAML
11 Mar 2025

Conceptual Contrastive Edits in Textual and Vision-Language Retrieval
Maria Lymperaiou, Giorgos Stamou
Topics: VLM
01 Mar 2025

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
Thomas Fel, Ekdeep Singh Lubana, Jacob S. Prince, M. Kowal, Victor Boutin, Isabel Papadimitriou, Binxu Wang, Martin Wattenberg, Demba Ba, Talia Konkle
18 Feb 2025

Error-controlled non-additive interaction discovery in machine learning models
Nature Machine Intelligence (Nat. Mach. Intell.), 2024
Winston Chen, Yifan Jiang, William Stafford Noble, Yang Young Lu
17 Feb 2025

We Can't Understand AI Using our Existing Vocabulary
John Hewitt, Robert Geirhos, Been Kim
11 Feb 2025

The Effect of Similarity Measures on Accurate Stability Estimates for Local Surrogate Models in Text-based Explainable AI
Christopher Burger, Charles Walter, Thai Le
Topics: AAML
20 Jan 2025
Explainable Adversarial Attacks on Coarse-to-Fine Classifiers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Akram Heidarizadeh, Connor Hatfield, Lorenzo Lazzarotto, HanQin Cai, George Atia
Topics: AAML
19 Jan 2025

Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Marko Tuononen, Dani Korpi, Ville Hautamäki
Topics: FAtt
10 Jan 2025

Towards Robust and Accurate Stability Estimation of Local Surrogate Models in Text-based Explainable AI
Christopher Burger, Charles Walter, Thai Le, Lingwei Chen
Topics: AAML
03 Jan 2025

Impact of Adversarial Attacks on Deep Learning Model Explainability
Gazi Nazia Nur, Mohammad Ahnaf Sadat
Topics: AAML, FAtt
15 Dec 2024

Quantized and Interpretable Learning Scheme for Deep Neural Networks in Classification Task
Conference Information and Communication Technology (ICT), 2024
Alireza Maleki, Mahsa Lavaei, Mohsen Bagheritabar, Salar Beigzad, Zahra Abadi
Topics: MQ
05 Dec 2024

Machines and Mathematical Mutations: Using GNNs to Characterize Quiver Mutation Classes
Jesse He, Helen Jenne, Herman Chau, Davis Brown, Mark Raugas, Sara Billey, Henry Kvinge
12 Nov 2024
EXAGREE: Mitigating Explanation Disagreement with Stakeholder-Aligned Models
Sichao Li, Tommy Liu, Quanling Deng, Amanda S. Barnard
04 Nov 2024

Transparent Trade-offs between Properties of Explanations
Conference on Uncertainty in Artificial Intelligence (UAI), 2024
Hiwot Belay Tadesse, Alihan Hüyük, Yaniv Yacoby, Weiwei Pan, Finale Doshi-Velez
Topics: FAtt
31 Oct 2024

CausAdv: A Causal-based Framework for Detecting Adversarial Examples
Hichem Debbi
Topics: CML, AAML
29 Oct 2024

Prototype-Based Methods in Explainable AI and Emerging Opportunities in the Geosciences
Anushka Narayanan, Karianne J. Bergen
22 Oct 2024

SSET: Swapping-Sliding Explanation for Time Series Classifiers in Affect Detection
Nazanin Fouladgar, Marjan Alirezaie, Kary Främling
Topics: AI4TS, FAtt
16 Oct 2024

Unlearning-based Neural Interpretations
International Conference on Learning Representations (ICLR), 2024
Ching Lam Choi, Alexandre Duplessis, Serge Belongie
Topics: FAtt
10 Oct 2024
Faithful Interpretation for Graph Neural Networks
Lijie Hu, Tianhao Huang, Lu Yu, Wanyu Lin, Tianhang Zheng, Di Wang
09 Oct 2024

A mechanistically interpretable neural network for regulatory genomics
Alex Tseng, Gökçen Eraslan, Tommaso Biancalani, Gabriele Scalia
08 Oct 2024

Understanding with toy surrogate models in machine learning
Andrés Páez
Topics: SyDa
08 Oct 2024

Mechanistic?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Naomi Saphra, Sarah Wiegreffe
Topics: AI4CE
07 Oct 2024

Run-time Observation Interventions Make Vision-Language-Action Models More Visually Robust
IEEE International Conference on Robotics and Automation (ICRA), 2024
Asher Hancock, Allen Z. Ren, Anirudha Majumdar
Topics: VLM
02 Oct 2024

Faithfulness and the Notion of Adversarial Sensitivity in NLP Explanations
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Supriya Manna, Niladri Sett
Topics: AAML
26 Sep 2024

Deep Manifold Part 1: Anatomy of Neural Network Manifold
Max Y. Ma, Gen-Hua Shi
Topics: PINN, 3DPC
26 Sep 2024
Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey
Yi Zhang, Zhen Chen, Chih-Hong Cheng, Wenjie Ruan, Xiaowei Huang, Dezong Zhao, David Flynn, Siddartha Khastgir, Xingyu Zhao
Topics: MedIm
26 Sep 2024

Leveraging Local Structure for Improving Model Explanations: An Information Propagation Approach
International Conference on Information and Knowledge Management (CIKM), 2024
Ruo Yang, Binghui Wang, M. Bilgic
Topics: FAtt
24 Sep 2024