Interpretation of Neural Networks is Fragile

AAAI Conference on Artificial Intelligence (AAAI), 2017
29 October 2017
Amirata Ghorbani, Abubakar Abid, James Zou
FAtt, AAML

Papers citing "Interpretation of Neural Networks is Fragile"

50 / 489 papers shown
Concept-based Explainable Artificial Intelligence: A Survey
Eleonora Poeta, Gabriele Ciravegna, Eliana Pastor, Tania Cerquitelli, Elena Baralis
LRM, XAI
238 · 84 · 0 · 20 Dec 2023

CEIR: Concept-based Explainable Image Representation Learning
Yan Cui, Shuhong Liu, Liuzhuozheng Li, Zhiyuan Yuan
SSL, VLM
132 · 5 · 0 · 17 Dec 2023

Rethinking Robustness of Model Attributions
AAAI Conference on Artificial Intelligence (AAAI), 2023
Sandesh Kamath, Sankalp Mittal, Amit Deshpande, Vineeth N. Balasubramanian
206 · 2 · 0 · 16 Dec 2023

CIDR: A Cooperative Integrated Dynamic Refining Method for Minimal Feature Removal Problem
AAAI Conference on Artificial Intelligence (AAAI), 2023
Qian Chen, Tao Zhang, Dongyang Li, Xiaofeng He
410 · 0 · 0 · 13 Dec 2023

TaCo: Targeted Concept Removal in Output Embeddings for NLP via Information Theory and Explainability
Fanny Jourdan, Louis Bethune, Agustin Picard, Laurent Risser, Nicholas M. Asher
182 · 0 · 0 · 11 Dec 2023

Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman, Andrew Kyle Lampinen, Lucas Dixon, Danqi Chen, Asma Ghandeharioun
317 · 19 · 0 · 06 Dec 2023

Interpretable Knowledge Tracing via Response Influence-based Counterfactual Reasoning
IEEE International Conference on Data Engineering (ICDE), 2023
Jiajun Cui, Minghe Yu, Bo Jiang, Aimin Zhou, Jianyong Wang, Wei Zhang
249 · 7 · 0 · 01 Dec 2023

Improving Interpretation Faithfulness for Vision Transformers
International Conference on Machine Learning (ICML), 2023
Lijie Hu, Yixin Liu, Ninghao Liu, Mengdi Huai, Lichao Sun, Haiyan Zhao
203 · 12 · 0 · 29 Nov 2023

Uncertainty in Additive Feature Attribution methods
Abhishek Madaan, Tanya Chowdhury, Neha Rana, James Allan, Tanmoy Chakraborty
175 · 0 · 0 · 29 Nov 2023

FocusLearn: Fully-Interpretable, High-Performance Modular Neural Networks for Time Series
IEEE International Joint Conference on Neural Network (IJCNN), 2023
Qiqi Su, Christos Kloukinas, Artur d'Avila Garcez
AI4TS
317 · 8 · 0 · 28 Nov 2023

Survey on AI Ethics: A Socio-technical Perspective
International Conference on Climate Informatics (ICCI), 2023
Dave Mbiazi, Meghana Bhange, Maryam Babaei, Ivaxi Sheth, Patrik Kenfack, Samira Ebrahimi Kahou
391 · 9 · 0 · 28 Nov 2023

MRxaI: Black-Box Explainability for Image Classifiers in a Medical Setting
Nathan Blake, Hana Chockler, David A. Kelly, Santiago Calderón-Pena, Akchunya Chanchal
114 · 5 · 0 · 24 Nov 2023

On the Relationship Between Interpretability and Explainability in Machine Learning
Benjamin Leblanc, Pascal Germain
FaML
390 · 1 · 0 · 20 Nov 2023

Explainability of Vision Transformers: A Comprehensive Review and New Perspectives
Rojina Kashefi, Leili Barekatain, Mohammad Sabokrou, Fatemeh Aghaeipoor
ViT
262 · 19 · 0 · 12 Nov 2023

SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training
Rui Xu, Wenkang Qin, Peixiang Huang, Hao Wang, Lin Luo
FAtt, AAML
262 · 3 · 0 · 09 Nov 2023

SmoothHess: ReLU Network Feature Interactions via Stein's Lemma
Neural Information Processing Systems (NeurIPS), 2023
Max Torop, A. Masoomi, Davin Hill, Kivanc Kose, Stratis Ioannidis, Jennifer Dy
320 · 7 · 0 · 01 Nov 2023

Corrupting Neuron Explanations of Deep Visual Features
IEEE International Conference on Computer Vision (ICCV), 2023
Divyansh Srivastava, Tuomas P. Oikarinen, Tsui-Wei Weng
FAtt, AAML
114 · 3 · 0 · 25 Oct 2023

Explanation-based Training with Differentiable Insertion/Deletion Metric-aware Regularizers
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Yuya Yoshikawa, Tomoharu Iwata
223 · 1 · 0 · 19 Oct 2023

Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations
Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, Leilani H. Gilpin
LRM, MILM, ELM
356 · 107 · 0 · 17 Oct 2023

A New Baseline Assumption of Integrated Gradients Based on Shapley value
Shuyang Liu, Zixuan Chen, Ge Shi, Ji Wang, Changjie Fan, Yu Xiong, Runze Wu, Yujing Hu, Ze Ji, Yang Gao
235 · 2 · 0 · 07 Oct 2023

The Blame Problem in Evaluating Local Explanations, and How to Tackle it
Amir Hossein Akhavan Rahnama
ELM, FAtt
241 · 7 · 0 · 05 Oct 2023

SMOOT: Saliency Guided Mask Optimized Online Training
Ali Karkehabadi, Houman Homayoun, Avesta Sasan
AAML
277 · 21 · 0 · 01 Oct 2023

Counterfactual Image Generation for adversarially robust and interpretable Classifiers
Rafael Bischof, Florian Scheidegger, Michael A. Kraus, A. Malossi
AAML
202 · 2 · 0 · 01 Oct 2023

Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals
International Conference on Learning Representations (ICLR), 2023
Y. Gat, Nitay Calderon, Amir Feder, Alexander Chapanin, Amit Sharma, Roi Reichart
380 · 42 · 0 · 01 Oct 2023

Black-box Attacks on Image Activity Prediction and its Natural Language Explanations
Alina Elena Baia, Valentina Poggioni, Andrea Cavallaro
AAML
192 · 1 · 0 · 30 Sep 2023

DeepROCK: Error-controlled interaction detection in deep neural networks
Winston Chen, William Stafford Noble, Y. Lu
262 · 1 · 0 · 26 Sep 2023

Concept explainability for plant diseases classification
VISIGRAPP (VISIGRAPP), 2023
Jihen Amara, B. König-Ries, Sheeba Samuel
FAtt
98 · 2 · 0 · 15 Sep 2023

Interpretability is in the Mind of the Beholder: A Causal Framework for Human-interpretable Representation Learning
Entropy (Entropy), 2023
Emanuele Marconato, Baptiste Caramiaux, Stefano Teso
264 · 19 · 0 · 14 Sep 2023

Automatic Concept Embedding Model (ACEM): No train-time concepts, No issue!
Rishabh Jain
LRM
106 · 0 · 0 · 07 Sep 2023

Goodhart's Law Applies to NLP's Explanation Benchmarks
Findings (Findings), 2023
Jennifer Hsia, Danish Pruthi, Aarti Singh, Zachary Chase Lipton
203 · 7 · 0 · 28 Aug 2023

On the Interpretability of Quantum Neural Networks
Quantum Machine Intelligence (QMI), 2023
Lirande Pira, C. Ferrie
FAtt
188 · 29 · 0 · 22 Aug 2023

FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare
British medical journal (BMJ), 2023
Karim Lekadir, Aasa Feragen, Abdul Joseph Fofanah, Alejandro F Frangi, Alena Buyx, ..., Yi Zeng, Yunusa G Mohammed, Yves Saint James Aquino, Zohaib Salahuddin, M. P. Starmans
AI4TS
297 · 206 · 0 · 11 Aug 2023

A New Perspective on Evaluation Methods for Explainable Artificial Intelligence (XAI)
Timo Speith, Markus Langer
205 · 16 · 0 · 26 Jul 2023

Saliency strikes back: How filtering out high frequencies improves white-box explanations
International Conference on Machine Learning (ICML), 2023
Sabine Muzellec, Thomas Fel, Victor Boutin, Léo Andéol, R. V. Rullen, Thomas Serre
FAtt
415 · 3 · 0 · 18 Jul 2023

SHAMSUL: Systematic Holistic Analysis to investigate Medical Significance Utilizing Local interpretability methods in deep learning for chest radiography pathology prediction
Nordic Machine Intelligence (NMI), 2023
Mahbub Ul Alam, Jaakko Hollmén, Jón R. Baldvinsson, R. Rahmani
FAtt
424 · 3 · 0 · 16 Jul 2023

On the Connection between Game-Theoretic Feature Attributions and Counterfactual Explanations
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023
Emanuele Albini, Sanjay Kariyappa, Saumitra Mishra, Danial Dervovic, Daniele Magazzeni
FAtt
250 · 3 · 0 · 13 Jul 2023

Single-Class Target-Specific Attack against Interpretable Deep Learning Systems
Eldor Abdukhamidov, Mohammed Abuhamad, George K. Thiruvathukal, Hyoungshick Kim, Tamer Abuhmed
AAML
187 · 2 · 0 · 12 Jul 2023

Stability Guarantees for Feature Attributions with Multiplicative Smoothing
Neural Information Processing Systems (NeurIPS), 2023
Anton Xue, Rajeev Alur, Eric Wong
313 · 12 · 0 · 12 Jul 2023

Hierarchical Semantic Tree Concept Whitening for Interpretable Image Classification
Haixing Dai, Lu Zhang, Lin Zhao, Zihao Wu, Zheng Liu, ..., Yanjun Lyu, Changying Li, Ninghao Liu, Tianming Liu, Dajiang Zhu
238 · 7 · 0 · 10 Jul 2023

Robust Ranking Explanations
Chao Chen, Chenghua Guo, Guixiang Ma, Ming Zeng, Xi Zhang, Sihong Xie
FAtt, AAML
335 · 0 · 0 · 08 Jul 2023

A Vulnerability of Attribution Methods Using Pre-Softmax Scores
Miguel A. Lerma, Mirtha Lucas
FAtt
230 · 0 · 0 · 06 Jul 2023

DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Adam Ivankay, Mattia Rigotti, P. Frossard
OOD, MedIm
184 · 1 · 0 · 05 Jul 2023

Fixing confirmation bias in feature attribution methods via semantic match
Giovanni Cina, Daniel Fernandez-Llaneza, Ludovico Deponte, Nishant Mishra, Tabea E. Rober, Sandro Pezzelle, Iacer Calixto, Rob Goedhart, Ş. İlker Birbil
FAtt
229 · 3 · 0 · 03 Jul 2023

SHARCS: Shared Concept Space for Explainable Multimodal Learning
Gabriele Dominici, Pietro Barbiero, Lucie Charlotte Magister, Pietro Lio, Nikola Simidjievski
158 · 6 · 0 · 01 Jul 2023

Verifying Safety of Neural Networks from Topological Perspectives
Science of Computer Programming (SCP), 2023
Zhen Liang, Dejin Ren, Bai Xue, Jing Wang, Wenjing Yang, Wanwei Liu
AAML
212 · 0 · 0 · 27 Jun 2023

Requirements for Explainability and Acceptance of Artificial Intelligence in Collaborative Work
Interacción (HCI), 2023
Sabine Theis, Sophie F. Jentzsch, Fotini Deligiannaki, C. Berro, A. Raulf, C. Bruder
178 · 13 · 0 · 27 Jun 2023

Evaluating the overall sensitivity of saliency-based explanation methods
Harshinee Sriram, Cristina Conati
AAML, XAI, FAtt
150 · 0 · 0 · 21 Jun 2023

On the Robustness of Removal-Based Feature Attributions
Neural Information Processing Systems (NeurIPS), 2023
Christy Lin, Ian Covert, Su-In Lee
361 · 18 · 0 · 12 Jun 2023

A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation
Neural Information Processing Systems (NeurIPS), 2023
Thomas Fel, Victor Boutin, Mazda Moayeri, Rémi Cadène, Louis Bethune, Léo Andéol, Mathieu Chalvidal, Thomas Serre
FAtt
299 · 83 · 0 · 11 Jun 2023

On Minimizing the Impact of Dataset Shifts on Actionable Explanations
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Anna P. Meyer, Dan Ley, Suraj Srinivas, Himabindu Lakkaraju
FAtt
191 · 6 · 0 · 11 Jun 2023