arXiv: 1710.10547
Interpretation of Neural Networks is Fragile
AAAI Conference on Artificial Intelligence (AAAI), 2019
29 October 2017
Amirata Ghorbani
Abubakar Abid
James Zou
FAtt
AAML
Papers citing "Interpretation of Neural Networks is Fragile"
50 / 489 papers shown
Concept-based Explainable Artificial Intelligence: A Survey
Eleonora Poeta
Gabriele Ciravegna
Eliana Pastor
Tania Cerquitelli
Elena Baralis
LRM
XAI
238
84
0
20 Dec 2023
CEIR: Concept-based Explainable Image Representation Learning
Yan Cui
Shuhong Liu
Liuzhuozheng Li
Zhiyuan Yuan
SSL
VLM
132
5
0
17 Dec 2023
Rethinking Robustness of Model Attributions
AAAI Conference on Artificial Intelligence (AAAI), 2023
Sandesh Kamath
Sankalp Mittal
Amit Deshpande
Vineeth N. Balasubramanian
206
2
0
16 Dec 2023
CIDR: A Cooperative Integrated Dynamic Refining Method for Minimal Feature Removal Problem
AAAI Conference on Artificial Intelligence (AAAI), 2023
Qian Chen
Tao Zhang
Dongyang Li
Xiaofeng He
410
0
0
13 Dec 2023
TaCo: Targeted Concept Removal in Output Embeddings for NLP via Information Theory and Explainability
Fanny Jourdan
Louis Bethune
Agustin Picard
Laurent Risser
Nicholas M. Asher
182
0
0
11 Dec 2023
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman
Andrew Kyle Lampinen
Lucas Dixon
Danqi Chen
Asma Ghandeharioun
317
19
0
06 Dec 2023
Interpretable Knowledge Tracing via Response Influence-based Counterfactual Reasoning
IEEE International Conference on Data Engineering (ICDE), 2023
Jiajun Cui
Minghe Yu
Bo Jiang
Aimin Zhou
Jianyong Wang
Wei Zhang
249
7
0
01 Dec 2023
Improving Interpretation Faithfulness for Vision Transformers
International Conference on Machine Learning (ICML), 2023
Lijie Hu
Yixin Liu
Ninghao Liu
Mengdi Huai
Lichao Sun
Haiyan Zhao
203
12
0
29 Nov 2023
Uncertainty in Additive Feature Attribution methods
Abhishek Madaan
Tanya Chowdhury
Neha Rana
James Allan
Tanmoy Chakraborty
175
0
0
29 Nov 2023
FocusLearn: Fully-Interpretable, High-Performance Modular Neural Networks for Time Series
IEEE International Joint Conference on Neural Networks (IJCNN), 2023
Qiqi Su
Christos Kloukinas
Artur d'Ávila Garcez
AI4TS
317
8
0
28 Nov 2023
Survey on AI Ethics: A Socio-technical Perspective
International Conference on Climate Informatics (ICCI), 2023
Dave Mbiazi
Meghana Bhange
Maryam Babaei
Ivaxi Sheth
Patrik Kenfack
Samira Ebrahimi Kahou
391
9
0
28 Nov 2023
MRxaI: Black-Box Explainability for Image Classifiers in a Medical Setting
Nathan Blake
Hana Chockler
David A. Kelly
Santiago Calderón-Pena
Akchunya Chanchal
114
5
0
24 Nov 2023
On the Relationship Between Interpretability and Explainability in Machine Learning
Benjamin Leblanc
Pascal Germain
FaML
390
1
0
20 Nov 2023
Explainability of Vision Transformers: A Comprehensive Review and New Perspectives
Rojina Kashefi
Leili Barekatain
Mohammad Sabokrou
Fatemeh Aghaeipoor
ViT
262
19
0
12 Nov 2023
SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training
Rui Xu
Wenkang Qin
Peixiang Huang
Hao Wang
Lin Luo
FAtt
AAML
262
3
0
09 Nov 2023
SmoothHess: ReLU Network Feature Interactions via Stein's Lemma
Neural Information Processing Systems (NeurIPS), 2023
Max Torop
A. Masoomi
Davin Hill
Kivanc Kose
Stratis Ioannidis
Jennifer Dy
320
7
0
01 Nov 2023
Corrupting Neuron Explanations of Deep Visual Features
IEEE International Conference on Computer Vision (ICCV), 2023
Divyansh Srivastava
Tuomas P. Oikarinen
Tsui-Wei Weng
FAtt
AAML
114
3
0
25 Oct 2023
Explanation-based Training with Differentiable Insertion/Deletion Metric-aware Regularizers
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Yuya Yoshikawa
Tomoharu Iwata
223
1
0
19 Oct 2023
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations
Shiyuan Huang
Siddarth Mamidanna
Shreedhar Jangam
Yilun Zhou
Leilani H. Gilpin
LRM
MILM
ELM
356
107
0
17 Oct 2023
A New Baseline Assumption of Integrated Gradients Based on Shapley Value
Shuyang Liu
Zixuan Chen
Ge Shi
Ji Wang
Changjie Fan
Yu Xiong
Runze Wu
Yujing Hu
Ze Ji
Yang Gao
235
2
0
07 Oct 2023
The Blame Problem in Evaluating Local Explanations, and How to Tackle it
Amir Hossein Akhavan Rahnama
ELM
FAtt
241
7
0
05 Oct 2023
SMOOT: Saliency Guided Mask Optimized Online Training
Ali Karkehabadi
Houman Homayoun
Avesta Sasan
AAML
277
21
0
01 Oct 2023
Counterfactual Image Generation for adversarially robust and interpretable Classifiers
Rafael Bischof
Florian Scheidegger
Michael A. Kraus
A. Malossi
AAML
202
2
0
01 Oct 2023
Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals
International Conference on Learning Representations (ICLR), 2023
Y. Gat
Nitay Calderon
Amir Feder
Alexander Chapanin
Amit Sharma
Roi Reichart
380
42
0
01 Oct 2023
Black-box Attacks on Image Activity Prediction and its Natural Language Explanations
Alina Elena Baia
Valentina Poggioni
Andrea Cavallaro
AAML
192
1
0
30 Sep 2023
DeepROCK: Error-controlled interaction detection in deep neural networks
Winston Chen
William Stafford Noble
Y. Lu
262
1
0
26 Sep 2023
Concept explainability for plant diseases classification
VISIGRAPP (VISIGRAPP), 2023
Jihen Amara
B. König-Ries
Sheeba Samuel
FAtt
98
2
0
15 Sep 2023
Interpretability is in the Mind of the Beholder: A Causal Framework for Human-interpretable Representation Learning
Entropy (Entropy), 2023
Emanuele Marconato
Baptiste Caramiaux
Stefano Teso
264
19
0
14 Sep 2023
Automatic Concept Embedding Model (ACEM): No train-time concepts, No issue!
Rishabh Jain
LRM
106
0
0
07 Sep 2023
Goodhart's Law Applies to NLP's Explanation Benchmarks
Findings (Findings), 2023
Jennifer Hsia
Danish Pruthi
Aarti Singh
Zachary Chase Lipton
203
7
0
28 Aug 2023
On the Interpretability of Quantum Neural Networks
Quantum Machine Intelligence (QMI), 2023
Lirande Pira
C. Ferrie
FAtt
188
29
0
22 Aug 2023
FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare
British Medical Journal (BMJ), 2023
Karim Lekadir
Aasa Feragen
Abdul Joseph Fofanah
Alejandro F Frangi
Alena Buyx
...
Yi Zeng
Yunusa G Mohammed
Yves Saint James Aquino
Zohaib Salahuddin
M. P. Starmans
AI4TS
297
206
0
11 Aug 2023
A New Perspective on Evaluation Methods for Explainable Artificial Intelligence (XAI)
Timo Speith
Markus Langer
205
16
0
26 Jul 2023
Saliency strikes back: How filtering out high frequencies improves white-box explanations
International Conference on Machine Learning (ICML), 2023
Sabine Muzellec
Thomas Fel
Victor Boutin
Léo Andéol
R. V. Rullen
Thomas Serre
FAtt
415
3
0
18 Jul 2023
SHAMSUL: Systematic Holistic Analysis to investigate Medical Significance Utilizing Local interpretability methods in deep learning for chest radiography pathology prediction
Nordic Machine Intelligence (NMI), 2023
Mahbub Ul Alam
Jaakko Hollmén
Jón R. Baldvinsson
R. Rahmani
FAtt
424
3
0
16 Jul 2023
On the Connection between Game-Theoretic Feature Attributions and Counterfactual Explanations
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023
Emanuele Albini
Sanjay Kariyappa
Saumitra Mishra
Danial Dervovic
Daniele Magazzeni
FAtt
250
3
0
13 Jul 2023
Single-Class Target-Specific Attack against Interpretable Deep Learning Systems
Eldor Abdukhamidov
Mohammed Abuhamad
George K. Thiruvathukal
Hyoungshick Kim
Tamer Abuhmed
AAML
187
2
0
12 Jul 2023
Stability Guarantees for Feature Attributions with Multiplicative Smoothing
Neural Information Processing Systems (NeurIPS), 2023
Anton Xue
Rajeev Alur
Eric Wong
313
12
0
12 Jul 2023
Hierarchical Semantic Tree Concept Whitening for Interpretable Image Classification
Haixing Dai
Lu Zhang
Lin Zhao
Zihao Wu
Zheng Liu
...
Yanjun Lyu
Changying Li
Ninghao Liu
Tianming Liu
Dajiang Zhu
238
7
0
10 Jul 2023
Robust Ranking Explanations
Chao Chen
Chenghua Guo
Guixiang Ma
Ming Zeng
Xi Zhang
Sihong Xie
FAtt
AAML
335
0
0
08 Jul 2023
A Vulnerability of Attribution Methods Using Pre-Softmax Scores
Miguel A. Lerma
Mirtha Lucas
FAtt
230
0
0
06 Jul 2023
DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Adam Ivankay
Mattia Rigotti
P. Frossard
OOD
MedIm
184
1
0
05 Jul 2023
Fixing confirmation bias in feature attribution methods via semantic match
Giovanni Cinà
Daniel Fernandez-Llaneza
Ludovico Deponte
Nishant Mishra
Tabea E. Rober
Sandro Pezzelle
Iacer Calixto
Rob Goedhart
Ş. İlker Birbil
FAtt
229
3
0
03 Jul 2023
SHARCS: Shared Concept Space for Explainable Multimodal Learning
Gabriele Dominici
Pietro Barbiero
Lucie Charlotte Magister
Pietro Lio
Nikola Simidjievski
158
6
0
01 Jul 2023
Verifying Safety of Neural Networks from Topological Perspectives
Science of Computer Programming (SCP), 2023
Zhen Liang
Dejin Ren
Bai Xue
Jing Wang
Wenjing Yang
Wanwei Liu
AAML
212
0
0
27 Jun 2023
Requirements for Explainability and Acceptance of Artificial Intelligence in Collaborative Work
Interacción (HCI), 2023
Sabine Theis
Sophie F. Jentzsch
Fotini Deligiannaki
C. Berro
A. Raulf
C. Bruder
178
13
0
27 Jun 2023
Evaluating the overall sensitivity of saliency-based explanation methods
Harshinee Sriram
Cristina Conati
AAML
XAI
FAtt
150
0
0
21 Jun 2023
On the Robustness of Removal-Based Feature Attributions
Neural Information Processing Systems (NeurIPS), 2023
Christy Lin
Ian Covert
Su-In Lee
361
18
0
12 Jun 2023
A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation
Neural Information Processing Systems (NeurIPS), 2023
Thomas Fel
Victor Boutin
Mazda Moayeri
Rémi Cadène
Louis Bethune
Léo Andéol
Mathieu Chalvidal
Thomas Serre
FAtt
299
83
0
11 Jun 2023
On Minimizing the Impact of Dataset Shifts on Actionable Explanations
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Anna P. Meyer
Dan Ley
Suraj Srinivas
Himabindu Lakkaraju
FAtt
191
6
0
11 Jun 2023