1710.10547
Cited By
v1
v2 (latest)
Interpretation of Neural Networks is Fragile
AAAI Conference on Artificial Intelligence (AAAI), 2017
29 October 2017
Amirata Ghorbani
Abubakar Abid
James Zou
FAtt
AAML
Papers citing "Interpretation of Neural Networks is Fragile"
50 / 489 papers shown
Title
Beyond Model Interpretability: Socio-Structural Explanations in Machine Learning
AI & Society (AS), 2024
Andrew Smart
Atoosa Kasirzadeh
267
10
0
05 Sep 2024
Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance
Knowledge Discovery and Data Mining (KDD), 2024
Thomas Decker
Alexander Koebler
Michael Lebacher
Ingo Thon
Volker Tresp
Florian Buettner
244
2
0
24 Aug 2024
Improving Network Interpretability via Explanation Consistency Evaluation
IEEE Transactions on Multimedia (IEEE TMM), 2024
Hefeng Wu
Hao Jiang
Keze Wang
Ziyi Tang
Xianghuan He
Liang Lin
FAtt
AAML
285
3
0
08 Aug 2024
Joint Universal Adversarial Perturbations with Interpretations
Liang-bo Ning
Zeyu Dai
Wenqi Fan
Jingran Su
Chao Pan
Luning Wang
Qing Li
AAML
290
3
0
03 Aug 2024
Algebraic Adversarial Attacks on Integrated Gradients
Lachlan Simpson
Federico Costanza
Kyle Millar
A. Cheng
Cheng-Chew Lim
Hong-Gunn Chew
SILM
AAML
400
4
0
23 Jul 2024
Interpretable Concept-Based Memory Reasoning
David Debot
Pietro Barbiero
Francesco Giannini
Gabriele Ciravegna
Michelangelo Diligenti
Giuseppe Marra
LRM
285
13
0
22 Jul 2024
Auditing Local Explanations is Hard
Robi Bhattacharjee
U. V. Luxburg
LRM
MLAU
FAtt
235
5
0
18 Jul 2024
Understanding Visual Feature Reliance through the Lens of Complexity
Thomas Fel
Louis Bethune
Andrew Kyle Lampinen
Thomas Serre
Katherine Hermann
FAtt
CoGe
243
14
0
08 Jul 2024
CAT: Interpretable Concept-based Taylor Additive Models
Viet Duong
Qiong Wu
Zhengyi Zhou
Hongjue Zhao
Chenxiang Luo
Eric Zavesky
Huaxiu Yao
Huajie Shao
FAtt
284
3
0
25 Jun 2024
Linearly-Interpretable Concept Embedding Models for Text Analysis
Francesco De Santis
Philippe Bich
Gabriele Ciravegna
Pietro Barbiero
Danilo Giordano
Tania Cerquitelli
279
1
0
20 Jun 2024
ProtoS-ViT: Visual foundation models for sparse self-explainable classifications
Hugues Turbé
Mina Bjelogrlic
G. Mengaldo
Christian Lovis
ViT
281
8
0
14 Jun 2024
On the Robustness of Global Feature Effect Explanations
Hubert Baniecki
Giuseppe Casalicchio
B. Bischl
P. Biecek
300
4
0
13 Jun 2024
Explainable Graph Neural Networks Under Fire
Zhong Li
Simon Geisler
Yuhang Wang
Stephan Günnemann
M. Leeuwen
AAML
255
3
0
10 Jun 2024
Provably Better Explanations with Optimized Aggregation of Feature Attributions
International Conference on Machine Learning (ICML), 2024
Thomas Decker
Ananta R. Bhattarai
Jindong Gu
Volker Tresp
Florian Buettner
203
6
0
07 Jun 2024
Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions
International Conference on Machine Learning (ICML), 2024
Jingtan Wang
Xiaoqiang Lin
Rui Qiao
Chuan-Sheng Foo
Bryan Kian Hsiang Low
TDI
184
9
0
07 Jun 2024
Expected Grad-CAM: Towards gradient faithfulness
Vincenzo Buono
Peyman Sheikholharam Mashhadi
M. Rahat
Prayag Tiwari
Stefan Byttner
FAtt
243
3
0
03 Jun 2024
Interpretable Prognostics with Concept Bottleneck Models
Florent Forest
Katharina Rombach
Olga Fink
205
10
0
27 May 2024
AnyCBMs: How to Turn Any Black Box into a Concept Bottleneck Model
Gabriele Dominici
Pietro Barbiero
Francesco Giannini
M. Gjoreski
Marc Langheinrich
204
5
0
26 May 2024
Exposing Image Classifier Shortcuts with Counterfactual Frequency (CoF) Tables
James Hinns
David Martens
291
4
0
24 May 2024
Why do explanations fail? A typology and discussion on failures in XAI
Clara Bove
Thibault Laugel
Marie-Jeanne Lesot
C. Tijus
Marcin Detyniecki
237
8
0
22 May 2024
Safety in Graph Machine Learning: Threats and Safeguards
Song Wang
Yushun Dong
Binchi Zhang
Zihan Chen
Xingbo Fu
Yinhan He
Cong Shen
Chuxu Zhang
Nitesh Chawla
Wenlin Yao
264
11
0
17 May 2024
Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
International Conference on Machine Learning (ICML), 2024
Eslam Zaher
Maciej Trzaskowski
Quan Nguyen
Fred Roosta
AAML
244
8
0
16 May 2024
Beyond the Black Box: Do More Complex Models Provide Superior XAI Explanations?
Mateusz Cedro
Marcin Chlebus
269
1
0
14 May 2024
Certified $\ell_2$ Attribution Robustness via Uniformly Smoothed Attributions
Fan Wang
Adams Wai-Kin Kong
205
2
0
10 May 2024
Learned feature representations are biased by complexity, learning order, position, and more
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Katherine Hermann
AI4CE
FaML
SSL
OOD
255
18
0
09 May 2024
Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution
Shuo Shao
Yiming Li
Hongwei Yao
Yiling He
Zhan Qin
Kui Ren
234
31
0
08 May 2024
Stability of Explainable Recommendation
ACM Conference on Recommender Systems (RecSys), 2023
Sairamvinay Vijayaraghavan
Prasant Mohapatra
AAML
245
4
0
03 May 2024
Why You Should Not Trust Interpretations in Machine Learning: Adversarial Attacks on Partial Dependence Plots
Xi Xin
Giles Hooker
Fei Huang
AAML
258
8
0
29 Apr 2024
SIDEs: Separating Idealization from Deceptive Explanations in xAI
Emily Sullivan
226
4
0
25 Apr 2024
Deep Neural Networks via Complex Network Theory: a Perspective
Emanuele La Malfa
G. Malfa
Giuseppe Nicosia
Vito Latora
GNN
212
5
0
17 Apr 2024
Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda
Johannes Schneider
242
73
0
15 Apr 2024
Exploring Explainability in Video Action Recognition
Avinab Saha
Shashank Gupta
S. Ankireddy
Karl Chahine
Joydeep Ghosh
89
7
0
13 Apr 2024
PASA: Attack Agnostic Unsupervised Adversarial Detection using Prediction & Attribution Sensitivity Analysis
Dipkamal Bhusal
Md Tanvirul Alam
M. K. Veerabhadran
Michael Clifford
Sara Rampazzi
Nidhi Rastogi
AAML
214
5
0
12 Apr 2024
Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training
Shizhan Gong
Qi Dou
Farzan Farnia
FAtt
221
5
0
06 Apr 2024
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models
Computer Vision and Pattern Recognition (CVPR), 2024
M. Kowal
Richard P. Wildes
Konstantinos G. Derpanis
GNN
283
14
0
02 Apr 2024
CAM-Based Methods Can See through Walls
Magamed Taimeskhanov
R. Sicre
Damien Garreau
239
3
0
02 Apr 2024
Evaluating Explanatory Capabilities of Machine Learning Models in Medical Diagnostics: A Human-in-the-Loop Approach
José Bobes-Bascarán
E. Mosqueira-Rey
Á. Fernández-Leal
Elena Hernández-Pereira
David Alonso-Ríos
V. Moret-Bonillo
Israel Figueirido-Arnoso
Y. Vidal-Ínsua
ELM
158
0
0
28 Mar 2024
The Anatomy of Adversarial Attacks: Concept-based XAI Dissection
Georgii Mikriukov
Gesina Schwalbe
Franz Motzkus
Korinna Bade
AAML
203
1
0
25 Mar 2024
Uncertainty-Aware Explanations Through Probabilistic Self-Explainable Neural Networks
Jon Vadillo
Roberto Santana
J. A. Lozano
Marta Z. Kwiatkowska
AAML
BDL
491
1
0
20 Mar 2024
Gradient based Feature Attribution in Explainable AI: A Technical Review
Yongjie Wang
Tong Zhang
Xu Guo
Zhiqi Shen
XAI
254
40
0
15 Mar 2024
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
Haoyang Liu
Aditya Singh
Yijiang Li
Haohan Wang
AAML
ViT
333
1
0
15 Mar 2024
Towards White Box Deep Learning
Maciej Satkiewicz
AAML
430
1
0
14 Mar 2024
Are Classification Robustness and Explanation Robustness Really Strongly Correlated? An Analysis Through Input Loss Landscape
Tiejin Chen
Wenwang Huang
Linsey Pang
Dongsheng Luo
Hua Wei
OOD
222
0
0
09 Mar 2024
XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques
Yu Xiong
Zhipeng Hu
Ye Huang
Runze Wu
Kai Guan
...
Tianze Zhou
Yujing Hu
Haoyu Liu
Tangjie Lyu
Changjie Fan
OffRL
356
2
0
20 Feb 2024
Probabilistic Shapley Value Modeling and Inference
Mert Ketenci
Inigo Urteaga
Victor Alfonso Rodriguez
Noémie Elhadad
A. Perotte
FAtt
305
0
0
06 Feb 2024
Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution
Neural Information Processing Systems (NeurIPS), 2024
Ian Covert
Chanwoo Kim
Su-In Lee
James Zou
Tatsunori Hashimoto
TDI
313
14
0
29 Jan 2024
Respect the model: Fine-grained and Robust Explanation with Sharing Ratio Decomposition
International Conference on Learning Representations (ICLR), 2024
Sangyu Han
Yearim Kim
Nojun Kwak
AAML
219
4
0
25 Jan 2024
AttributionScanner: A Visual Analytics System for Model Validation with Metadata-Free Slice Finding
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024
Xiwei Xuan
Jorge Henrique Piazentin Ono
Liang Gou
Kwan-Liu Ma
Liu Ren
244
7
0
12 Jan 2024
Manipulating Feature Visualizations with Gradient Slingshots
Dilyara Bareeva
Marina M.-C. Höhne
Alexander Warnecke
Lukas Pirch
Klaus-Robert Müller
Konrad Rieck
Sebastian Lapuschkin
Kirill Bykov
AAML
371
6
0
11 Jan 2024
Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training
Dongfang Li
Baotian Hu
Qingcai Chen
Shan He
279
8
0
29 Dec 2023