Interpretation of Neural Networks is Fragile
AAAI Conference on Artificial Intelligence (AAAI), 2017
29 October 2017
Amirata Ghorbani
Abubakar Abid
James Zou
FAtt
AAML
Papers citing
"Interpretation of Neural Networks is Fragile"
50 / 489 papers shown
Towards the Unification and Robustness of Perturbation and Gradient Based Explanations
International Conference on Machine Learning (ICML), 2021
Sushant Agarwal
S. Jabbari
Chirag Agarwal
Sohini Upadhyay
Zhiwei Steven Wu
Himabindu Lakkaraju
FAtt
AAML
271
70
0
21 Feb 2021
The Mind's Eye: Visualizing Class-Agnostic Features of CNNs
International Conference on Information Photonics (ICIP), 2021
Alexandros Stergiou
FAtt
68
4
0
29 Jan 2021
Better sampling in explanation methods can prevent dieselgate-like deception
Domen Vreš
Marko Robnik-Šikonja
AAML
104
11
0
26 Jan 2021
Investigating the significance of adversarial attacks and their relation to interpretability for radar-based human activity recognition systems
Computer Vision and Image Understanding (CVIU), 2021
Utku Ozbulak
Baptist Vandersmissen
A. Jalalvand
Ivo Couckuyt
Arnout Van Messem
W. D. Neve
AAML
81
20
0
26 Jan 2021
Show or Suppress? Managing Input Uncertainty in Machine Learning Model Explanations
Artificial Intelligence (AI), 2021
Danding Wang
Wencan Zhang
Brian Y. Lim
FAtt
93
27
0
23 Jan 2021
How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations
Conference on Fairness, Accountability and Transparency (FAccT), 2021
Sérgio Jesus
Catarina Belém
Vladimir Balayan
João Bento
Pedro Saleiro
P. Bizarro
João Gama
378
129
0
21 Jan 2021
Towards interpreting ML-based automated malware detection models: a survey
Yuzhou Lin
Xiaolin Chang
271
8
0
15 Jan 2021
Explainability of deep vision-based autonomous driving systems: Review and challenges
International Journal of Computer Vision (IJCV), 2021
Éloi Zablocki
H. Ben-younes
P. Pérez
Matthieu Cord
XAI
420
203
0
13 Jan 2021
Enhanced Regularizers for Attributional Robustness
AAAI Conference on Artificial Intelligence (AAAI), 2020
A. Sarkar
Anirban Sarkar
V. Balasubramanian
157
17
0
28 Dec 2020
A Survey on Neural Network Interpretability
IEEE Transactions on Emerging Topics in Computational Intelligence (IEEE TETCI), 2020
Yu Zhang
Peter Tiño
A. Leonardis
Shengcai Liu
FaML
XAI
462
814
0
28 Dec 2020
To what extent do human explanations of model behavior align with actual model behavior?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2020
Grusha Prasad
Yixin Nie
Joey Tianyi Zhou
Robin Jia
Douwe Kiela
Adina Williams
183
28
0
24 Dec 2020
Towards Robust Explanations for Deep Neural Networks
Pattern Recognition (Pattern Recognit.), 2020
Ann-Kathrin Dombrowski
Christopher J. Anders
K. Müller
Pan Kessel
FAtt
195
66
0
18 Dec 2020
Robustness to Spurious Correlations in Text Classification via Automatically Generated Counterfactuals
AAAI Conference on Artificial Intelligence (AAAI), 2020
Zhao Wang
A. Culotta
CML
OOD
170
108
0
18 Dec 2020
Debiased-CAM to mitigate image perturbations with faithful visual explanations of machine learning
International Conference on Human Factors in Computing Systems (CHI), 2020
Wencan Zhang
Mariella Dimiccoli
Brian Y. Lim
FAtt
315
19
0
10 Dec 2020
Understanding Interpretability by generalized distillation in Supervised Classification
Adit Agarwal
Dr. K.K. Shukla
Arjan Kuijper
Anirban Mukhopadhyay
FaML
FAtt
128
0
0
05 Dec 2020
LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering
Weixin Liang
Fei Niu
Aishwarya N. Reganti
Govind Thattai
Gokhan Tur
163
19
0
21 Nov 2020
Backdoor Attacks on the DNN Interpretation System
AAAI Conference on Artificial Intelligence (AAAI), 2020
Shihong Fang
A. Choromańska
FAtt
AAML
196
22
0
21 Nov 2020
One Explanation is Not Enough: Structured Attention Graphs for Image Classification
Neural Information Processing Systems (NeurIPS), 2020
Vivswan Shitole
Li Fuxin
Minsuk Kahng
Prasad Tadepalli
Alan Fern
FAtt
GNN
281
45
0
13 Nov 2020
Robust and Stable Black Box Explanations
International Conference on Machine Learning (ICML), 2020
Himabindu Lakkaraju
Nino Arsov
Osbert Bastani
AAML
FAtt
160
91
0
12 Nov 2020
Debugging Tests for Model Explanations
Julius Adebayo
M. Muelly
Ilaria Liccardi
Been Kim
FAtt
266
198
0
10 Nov 2020
Benchmarking Deep Learning Interpretability in Time Series Predictions
Neural Information Processing Systems (NeurIPS), 2020
Aya Abdelsalam Ismail
Mohamed K. Gunady
H. C. Bravo
Soheil Feizi
XAI
AI4TS
FAtt
275
203
0
26 Oct 2020
Measuring Association Between Labels and Free-Text Rationales
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Sarah Wiegreffe
Ana Marasović
Noah A. Smith
632
194
0
24 Oct 2020
Exemplary Natural Images Explain CNN Activations Better than State-of-the-Art Feature Visualization
Judy Borowski
Roland S. Zimmermann
Judith Schepers
Robert Geirhos
Thomas S. A. Wallis
Matthias Bethge
Wieland Brendel
FAtt
209
7
0
23 Oct 2020
Optimism in the Face of Adversity: Understanding and Improving Deep Learning through Adversarial Robustness
Proceedings of the IEEE (Proc. IEEE), 2020
Guillermo Ortiz-Jiménez
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
AAML
320
50
0
19 Oct 2020
Evaluating Attribution Methods using White-Box LSTMs
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2020
Sophie Hao
FAtt
XAI
181
8
0
16 Oct 2020
FAR: A General Framework for Attributional Robustness
Adam Ivankay
Ivan Girardi
Chiara Marchiori
P. Frossard
OOD
220
22
0
14 Oct 2020
Learning Propagation Rules for Attribution Map Generation
Yiding Yang
Jiayan Qiu
Xiuming Zhang
Dacheng Tao
Xinchao Wang
FAtt
140
17
0
14 Oct 2020
Neural Gaussian Mirror for Controlled Feature Selection in Neural Networks
Xin Xing
Yu Gui
Chenguang Dai
Jun S. Liu
AAML
140
6
0
13 Oct 2020
Gradient-based Analysis of NLP Models is Manipulable
Junlin Wang
Jens Tuyls
Eric Wallace
Sameer Singh
AAML
FAtt
164
62
0
12 Oct 2020
Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization
Neurocomputing (Neurocomputing), 2020
Alexander Katzmann
O. Taubmann
Stephen Ahmad
Alexander Muhlberg
M. Sühling
H. Groß
MedIm
135
26
0
09 Oct 2020
A survey of algorithmic recourse: definitions, formulations, solutions, and prospects
Amir-Hossein Karimi
Gilles Barthe
Bernhard Schölkopf
Isabel Valera
FaML
327
182
0
08 Oct 2020
Information-Theoretic Visual Explanation for Black-Box Classifiers
Jihun Yi
Eunji Kim
Siwon Kim
Sungroh Yoon
FAtt
168
6
0
23 Sep 2020
What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors
Knowledge Discovery and Data Mining (KDD), 2020
Yi-Shan Lin
Wen-Chuan Lee
Z. Berkay Celik
XAI
191
105
0
22 Sep 2020
Reconstructing Actions To Explain Deep Reinforcement Learning
Xuan Chen
Zifan Wang
Yucai Fan
Bonan Jin
Piotr (Peter) Mardziel
Carlee Joe-Wong
Anupam Datta
FAtt
239
2
0
17 Sep 2020
Evolutionary Selective Imitation: Interpretable Agents by Imitation Learning Without a Demonstrator
Roy Eliya
J. Herrmann
107
2
0
17 Sep 2020
Are Interpretations Fairly Evaluated? A Definition Driven Pipeline for Post-Hoc Interpretability
Ninghao Liu
Yunsong Meng
Helen Zhou
Tie Wang
Bo Long
XAI
FAtt
167
7
0
16 Sep 2020
How Good is your Explanation? Algorithmic Stability Measures to Assess the Quality of Explanations for Deep Neural Networks
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Thomas Fel
David Vigouroux
Rémi Cadène
Thomas Serre
XAI
FAtt
365
34
0
07 Sep 2020
Model extraction from counterfactual explanations
Ulrich Aïvodji
Alexandre Bolot
Sébastien Gambs
MIACV
MLAU
189
58
0
03 Sep 2020
Relevance Attack on Detectors
Sizhe Chen
Fan He
Xiaolin Huang
Kun Zhang
AAML
204
18
0
16 Aug 2020
Can We Trust Your Explanations? Sanity Checks for Interpreters in Android Malware Analysis
Ming Fan
Wenying Wei
Xiaofei Xie
Yang Liu
X. Guan
Ting Liu
FAtt
AAML
284
43
0
13 Aug 2020
Reliable Post hoc Explanations: Modeling Uncertainty in Explainability
Neural Information Processing Systems (NeurIPS), 2020
Dylan Slack
Sophie Hilgard
Sameer Singh
Himabindu Lakkaraju
FAtt
523
200
0
11 Aug 2020
Fairwashing Explanations with Off-Manifold Detergent
Christopher J. Anders
Plamen Pasliev
Ann-Kathrin Dombrowski
K. Müller
Pan Kessel
FAtt
FaML
140
103
0
20 Jul 2020
A simple defense against adversarial attacks on heatmap explanations
Laura Rieger
Lars Kai Hansen
FAtt
AAML
208
39
0
13 Jul 2020
Regional Image Perturbation Reduces $L_p$ Norms of Adversarial Examples While Maintaining Model-to-model Transferability
Utku Ozbulak
Jonathan Peck
W. D. Neve
Bart Goossens
Yvan Saeys
Arnout Van Messem
AAML
124
2
0
07 Jul 2020
Unifying Model Explainability and Robustness via Machine-Checkable Concepts
Vedant Nanda
Till Speicher
John P. Dickerson
Krishna P. Gummadi
Muhammad Bilal Zafar
AAML
163
4
0
01 Jul 2020
Proper Network Interpretability Helps Adversarial Robustness in Classification
Akhilan Boopathy
Sijia Liu
Gaoyuan Zhang
Cynthia Liu
Pin-Yu Chen
Shiyu Chang
Luca Daniel
AAML
FAtt
213
72
0
26 Jun 2020
Influence Functions in Deep Learning Are Fragile
S. Basu
Phillip E. Pope
Soheil Feizi
TDI
385
281
0
25 Jun 2020
Opportunities and Challenges in Explainable Artificial Intelligence (XAI): A Survey
Arun Das
P. Rad
XAI
386
696
0
16 Jun 2020
On Saliency Maps and Adversarial Robustness
Puneet Mangla
Vedant Singh
V. Balasubramanian
AAML
170
18
0
14 Jun 2020
Explainable Artificial Intelligence: a Systematic Review
Giulia Vilone
Luca Longo
XAI
551
300
0
29 May 2020