Interpretation of Neural Networks is Fragile

AAAI Conference on Artificial Intelligence (AAAI), 2019
29 October 2017
Amirata Ghorbani
Abubakar Abid
James Zou
FAtt, AAML

Papers citing "Interpretation of Neural Networks is Fragile"

39 / 489 papers shown
Evaluating Explanation Methods for Deep Learning in Security
European Symposium on Security and Privacy (EuroS&P), 2019
Alexander Warnecke
Dan Arp
Christian Wressnegger
Konrad Rieck
XAI, AAML, FAtt
239
117
0
05 Jun 2019
Adversarial Robustness as a Prior for Learned Representations
Logan Engstrom
Andrew Ilyas
Shibani Santurkar
Dimitris Tsipras
Brandon Tran
Aleksander Madry
OOD, AAML
216
63
0
03 Jun 2019
Certifiably Robust Interpretation in Deep Learning
Alexander Levine
Sahil Singla
Soheil Feizi
FAtt, AAML
315
65
0
28 May 2019
Analyzing the Interpretability Robustness of Self-Explaining Models
Haizhong Zheng
Earlence Fernandes
A. Prakash
AAML, LRM
168
7
0
27 May 2019
Robust Attribution Regularization
Neural Information Processing Systems (NeurIPS), 2019
Jiefeng Chen
Xi Wu
Vaibhav Rastogi
Yingyu Liang
S. Jha
OOD
172
88
0
23 May 2019
What Do Adversarially Robust Models Look At?
Takahiro Itazuri
Yoshihiro Fukuhara
Hirokatsu Kataoka
Shigeo Morishima
101
5
0
19 May 2019
Misleading Failures of Partial-input Baselines
Shi Feng
Eric Wallace
Jordan L. Boyd-Graber
211
0
0
14 May 2019
Interpreting Adversarial Examples with Attributes
Sadaf Gulshad
J. H. Metzen
A. Smeulders
Zeynep Akata
FAtt, AAML
169
6
0
17 Apr 2019
HARK Side of Deep Learning -- From Grad Student Descent to Automated Machine Learning
O. Gencoglu
M. Gils
E. Guldogan
Chamin Morikawa
Mehmet Süzen
M. Gruber
J. Leinonen
H. Huttunen
145
38
0
16 Apr 2019
Data Shapley: Equitable Valuation of Data for Machine Learning
Amirata Ghorbani
James Zou
TDI, FedML
496
953
0
05 Apr 2019
Explaining Deep Neural Networks with a Polynomial Time Algorithm for Shapley Values Approximation
Marco Ancona
Cengiz Öztireli
Markus Gross
FAtt, TDI
280
253
0
26 Mar 2019
Interpreting Neural Networks Using Flip Points
Roozbeh Yousefzadeh
D. O’Leary
AAML, FAtt
117
18
0
21 Mar 2019
Attribution-driven Causal Analysis for Detection of Adversarial Examples
Susmit Jha
Sunny Raj
S. Fernandes
Sumit Kumar Jha
S. Jha
Gunjan Verma
B. Jalaeian
A. Swami
AAML
149
17
0
14 Mar 2019
Aggregating explanation methods for stable and robust explainability
Laura Rieger
Lars Kai Hansen
AAML, FAtt
284
15
0
01 Mar 2019
Functional Transparency for Structured Data: a Game-Theoretic Approach
International Conference on Machine Learning (ICML), 2019
Guang-He Lee
Wengong Jin
David Alvarez-Melis
Tommi Jaakkola
148
19
0
26 Feb 2019
Seven Myths in Machine Learning Research
Oscar Chang
Hod Lipson
53
0
0
18 Feb 2019
Regularizing Black-box Models for Improved Interpretability
Gregory Plumb
Maruan Al-Shedivat
Ángel Alexander Cabrera
Adam Perer
Eric Xing
Ameet Talwalkar
AAML
508
83
0
18 Feb 2019
Towards Automatic Concept-based Explanations
Amirata Ghorbani
James Wexler
James Zou
Been Kim
FAtt, LRM
321
20
0
07 Feb 2019
Fooling Neural Network Interpretations via Adversarial Model Manipulation
Juyeon Heo
Sunghwan Joo
Taesup Moon
AAML, FAtt
373
223
0
06 Feb 2019
Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation
International Conference on Machine Learning (ICML), 2019
Sahil Singla
Eric Wallace
Shi Feng
Soheil Feizi
FAtt
207
60
0
01 Feb 2019
Interpreting Deep Neural Networks Through Variable Importance
J. Ish-Horowicz
Dana Udwin
Seth Flaxman
Sarah Filippi
Lorin Crawford
FAtt
133
16
0
28 Jan 2019
On the (In)fidelity and Sensitivity for Explanations
Chih-Kuan Yeh
Cheng-Yu Hsieh
A. Suggala
David I. Inouye
Pradeep Ravikumar
FAtt
327
523
0
27 Jan 2019
Fooling Network Interpretation in Image Classification
Akshayvarun Subramanya
Vipin Pillai
Hamed Pirsiavash
AAML, FAtt
201
7
0
06 Dec 2018
Compensated Integrated Gradients to Reliably Interpret EEG Classification
Kazuki Tachikawa
Yuji Kawai
Jihoon Park
Minoru Asada
FAtt
114
1
0
21 Nov 2018
What can AI do for me: Evaluating Machine Learning Interpretations in Cooperative Play
Shi Feng
Jordan L. Boyd-Graber
HAI
260
139
0
23 Oct 2018
Local Explanation Methods for Deep Neural Networks Lack Sensitivity to Parameter Values
Julius Adebayo
Justin Gilmer
Ian Goodfellow
Been Kim
FAtt, AAML
189
131
0
08 Oct 2018
Sanity Checks for Saliency Maps
Julius Adebayo
Justin Gilmer
M. Muelly
Ian Goodfellow
Moritz Hardt
Been Kim
FAtt, AAML, XAI
1.1K
2,202
0
08 Oct 2018
Training Machine Learning Models by Regularizing their Explanations
A. Ross
FaML
117
0
0
29 Sep 2018
Interpreting Neural Networks With Nearest Neighbors
Eric Wallace
Shi Feng
Jordan L. Boyd-Graber
AAML, FAtt, MILM
295
56
0
08 Sep 2018
DeepPINK: reproducible feature selection in deep neural networks
Yang Young Lu
Yingying Fan
Jinchi Lv
William Stafford Noble
FAtt
356
137
0
04 Sep 2018
Knockoffs for the mass: new feature importance statistics with false discovery guarantees
International Conference on Artificial Intelligence and Statistics (AISTATS), 2018
Jaime Roquero Gimenez
Amirata Ghorbani
James Zou
CML
153
57
0
17 Jul 2018
Adversarial Reprogramming of Neural Networks
International Conference on Learning Representations (ICLR), 2018
Gamaleldin F. Elsayed
Ian Goodfellow
Jascha Narain Sohl-Dickstein
OOD, AAML
174
196
0
28 Jun 2018
Gradient Similarity: An Explainable Approach to Detect Adversarial Attacks against Deep Learning
J. Dhaliwal
S. Shintre
AAML
83
16
0
27 Jun 2018
Interpreting Deep Learning: The Machine Learning Rorschach Test?
Adam S. Charles
AAML, HAI, AI4CE
194
9
0
01 Jun 2018
Pathologies of Neural Models Make Interpretations Difficult
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018
Shi Feng
Eric Wallace
Alvin Grissom II
Mohit Iyyer
Pedro Rodriguez
Jordan L. Boyd-Graber
AAML, FAtt
300
330
0
20 Apr 2018
Towards Explanation of DNN-based Prediction with Guided Feature Inversion
Mengnan Du
Ninghao Liu
Qingquan Song
Helen Zhou
FAtt
361
131
0
19 Mar 2018
Exact and Consistent Interpretation for Piecewise Linear Neural Networks: A Closed Form Solution
Lingyang Chu
X. Hu
Juhua Hu
Lanjun Wang
Jian Pei
146
104
0
17 Feb 2018
Deep Neural Generative Model of Functional MRI Images for Psychiatric Disorder Diagnosis
Takashi Matsubara
T. Tashiro
K. Uehara
MedIm
101
45
0
18 Dec 2017
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Been Kim
Martin Wattenberg
Justin Gilmer
Carrie J. Cai
James Wexler
F. Viégas
Rory Sayres
FAtt
492
2,095
0
30 Nov 2017