Interpretation of Neural Networks is Fragile
29 October 2017
Amirata Ghorbani
Abubakar Abid
James Y. Zou
FAtt
AAML
arXiv: 1710.10547
Papers citing
"Interpretation of Neural Networks is Fragile"
50 / 467 papers shown
Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
Eslam Zaher
Maciej Trzaskowski
Quan Nguyen
Fred Roosta
AAML
24
4
0
16 May 2024
Beyond the Black Box: Do More Complex Models Provide Superior XAI Explanations?
Mateusz Cedro
Marcin Chlebus
35
1
0
14 May 2024
Certified ℓ2 Attribution Robustness via Uniformly Smoothed Attributions
Fan Wang
Adams Wai-Kin Kong
43
1
0
10 May 2024
Learned feature representations are biased by complexity, learning order, position, and more
Andrew Kyle Lampinen
Stephanie C. Y. Chan
Katherine Hermann
AI4CE
FaML
SSL
OOD
34
6
0
09 May 2024
Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution
Shuo Shao
Yiming Li
Hongwei Yao
Yiling He
Zhan Qin
Kui Ren
32
14
0
08 May 2024
Stability of Explainable Recommendation
Sairamvinay Vijayaraghavan
Prasant Mohapatra
AAML
38
1
0
03 May 2024
Why You Should Not Trust Interpretations in Machine Learning: Adversarial Attacks on Partial Dependence Plots
Xi Xin
Giles Hooker
Fei Huang
AAML
38
6
0
29 Apr 2024
SIDEs: Separating Idealization from Deceptive Explanations in xAI
Emily Sullivan
49
2
0
25 Apr 2024
Deep Neural Networks via Complex Network Theory: a Perspective
Emanuele La Malfa
G. Malfa
Giuseppe Nicosia
Vito Latora
GNN
27
2
0
17 Apr 2024
Explainable Generative AI (GenXAI): A Survey, Conceptualization, and Research Agenda
Johannes Schneider
83
26
0
15 Apr 2024
Exploring Explainability in Video Action Recognition
Avinab Saha
Shashank Gupta
S. Ankireddy
Karl Chahine
Joydeep Ghosh
30
0
0
13 Apr 2024
PASA: Attack Agnostic Unsupervised Adversarial Detection using Prediction & Attribution Sensitivity Analysis
Dipkamal Bhusal
Md Tanvirul Alam
M. K. Veerabhadran
Michael Clifford
Sara Rampazzi
Nidhi Rastogi
AAML
43
1
0
12 Apr 2024
Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training
Shizhan Gong
Qi Dou
Farzan Farnia
FAtt
40
2
0
06 Apr 2024
Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models
M. Kowal
Richard P. Wildes
Konstantinos G. Derpanis
GNN
30
8
0
02 Apr 2024
CAM-Based Methods Can See through Walls
Magamed Taimeskhanov
R. Sicre
Damien Garreau
21
1
0
02 Apr 2024
Evaluating Explanatory Capabilities of Machine Learning Models in Medical Diagnostics: A Human-in-the-Loop Approach
José Bobes-Bascarán
E. Mosqueira-Rey
Á. Fernández-Leal
Elena Hernández-Pereira
David Alonso-Ríos
V. Moret-Bonillo
Israel Figueirido-Arnoso
Y. Vidal-Ínsua
ELM
27
0
0
28 Mar 2024
The Anatomy of Adversarial Attacks: Concept-based XAI Dissection
Georgii Mikriukov
Gesina Schwalbe
Franz Motzkus
Korinna Bade
AAML
24
1
0
25 Mar 2024
Gradient based Feature Attribution in Explainable AI: A Technical Review
Yongjie Wang
Tong Zhang
Xu Guo
Zhiqi Shen
XAI
19
18
0
15 Mar 2024
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
Haoyang Liu
Aditya Singh
Yijiang Li
Haohan Wang
AAML
ViT
36
1
0
15 Mar 2024
Towards White Box Deep Learning
Maciej Satkiewicz
AAML
34
1
0
14 Mar 2024
Are Classification Robustness and Explanation Robustness Really Strongly Correlated? An Analysis Through Input Loss Landscape
Tiejin Chen
Wenwang Huang
Linsey Pang
Dongsheng Luo
Hua Wei
OOD
43
0
0
09 Mar 2024
XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques
Yu Xiong
Zhipeng Hu
Ye Huang
Runze Wu
Kai Guan
...
Tianze Zhou
Yujing Hu
Haoyu Liu
Tangjie Lyu
Changjie Fan
OffRL
62
1
0
20 Feb 2024
Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley values with Uncertainty Quantification
Mert Ketenci
Inigo Urteaga
Victor Alfonso Rodriguez
Noémie Elhadad
A. Perotte
FAtt
17
0
0
06 Feb 2024
Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution
Ian Covert
Chanwoo Kim
Su-In Lee
James Y. Zou
Tatsunori Hashimoto
TDI
29
7
0
29 Jan 2024
Respect the model: Fine-grained and Robust Explanation with Sharing Ratio Decomposition
Sangyu Han
Yearim Kim
Nojun Kwak
AAML
23
1
0
25 Jan 2024
AttributionScanner: A Visual Analytics System for Model Validation with Metadata-Free Slice Finding
Xiwei Xuan
Jorge Henrique Piazentin Ono
Liang Gou
Kwan-Liu Ma
Liu Ren
53
1
0
12 Jan 2024
Manipulating Feature Visualizations with Gradient Slingshots
Dilyara Bareeva
Marina M.-C. Höhne
Alexander Warnecke
Lukas Pirch
Klaus-Robert Müller
Konrad Rieck
Kirill Bykov
AAML
37
6
0
11 Jan 2024
Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training
Dongfang Li
Baotian Hu
Qingcai Chen
Shan He
26
4
0
29 Dec 2023
Concept-based Explainable Artificial Intelligence: A Survey
Eleonora Poeta
Gabriele Ciravegna
Eliana Pastor
Tania Cerquitelli
Elena Baralis
LRM
XAI
21
41
0
20 Dec 2023
CEIR: Concept-based Explainable Image Representation Learning
Yan Cui
Shuhong Liu
Liuzhuozheng Li
Zhiyuan Yuan
SSL
VLM
26
3
0
17 Dec 2023
Rethinking Robustness of Model Attributions
Sandesh Kamath
Sankalp Mittal
Amit Deshpande
Vineeth N. Balasubramanian
20
2
0
16 Dec 2023
CIDR: A Cooperative Integrated Dynamic Refining Method for Minimal Feature Removal Problem
Qian Chen
Tao Zhang
Dongyang Li
Xiaofeng He
26
0
0
13 Dec 2023
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman
Andrew Kyle Lampinen
Lucas Dixon
Danqi Chen
Asma Ghandeharioun
17
14
0
06 Dec 2023
Interpretable Knowledge Tracing via Response Influence-based Counterfactual Reasoning
Jiajun Cui
Minghe Yu
Bo Jiang
Aimin Zhou
Jianyong Wang
Wei Zhang
29
2
0
01 Dec 2023
Improving Interpretation Faithfulness for Vision Transformers
Lijie Hu
Yixin Liu
Ninghao Liu
Mengdi Huai
Lichao Sun
Di Wang
27
5
0
29 Nov 2023
Uncertainty in Additive Feature Attribution methods
Abhishek Madaan
Tanya Chowdhury
Neha Rana
James Allan
Tanmoy Chakraborty
24
0
0
29 Nov 2023
Survey on AI Ethics: A Socio-technical Perspective
Dave Mbiazi
Meghana Bhange
Maryam Babaei
Ivaxi Sheth
Patrik Joslin Kenfack
20
4
0
28 Nov 2023
FocusLearn: Fully-Interpretable, High-Performance Modular Neural Networks for Time Series
Qiqi Su
Christos Kloukinas
Artur d'Ávila Garcez
AI4TS
16
3
0
28 Nov 2023
MRxaI: Black-Box Explainability for Image Classifiers in a Medical Setting
Nathan Blake
Hana Chockler
David A. Kelly
Santiago Calderón-Pena
Akchunya Chanchal
15
1
0
24 Nov 2023
On the Relationship Between Interpretability and Explainability in Machine Learning
Benjamin Leblanc
Pascal Germain
FaML
26
0
0
20 Nov 2023
Explainability of Vision Transformers: A Comprehensive Review and New Perspectives
Rojina Kashefi
Leili Barekatain
Mohammad Sabokrou
Fatemeh Aghaeipoor
ViT
37
9
0
12 Nov 2023
SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training
Rui Xu
Wenkang Qin
Peixiang Huang
Hao Wang
Lin Luo
FAtt
AAML
28
2
0
09 Nov 2023
SmoothHess: ReLU Network Feature Interactions via Stein's Lemma
Max Torop
A. Masoomi
Davin Hill
Kivanc Kose
Stratis Ioannidis
Jennifer Dy
23
4
0
01 Nov 2023
Corrupting Neuron Explanations of Deep Visual Features
Divyansh Srivastava
Tuomas P. Oikarinen
Tsui-Wei Weng
FAtt
AAML
17
2
0
25 Oct 2023
Explanation-based Training with Differentiable Insertion/Deletion Metric-aware Regularizers
Yuya Yoshikawa
Tomoharu Iwata
19
0
0
19 Oct 2023
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations
Shiyuan Huang
Siddarth Mamidanna
Shreedhar Jangam
Yilun Zhou
Leilani H. Gilpin
LRM
MILM
ELM
37
66
0
17 Oct 2023
A New Baseline Assumption of Integrated Gradients Based on Shapley value
Shuyang Liu
Zixuan Chen
Ge Shi
Ji Wang
Changjie Fan
Yu Xiong
Runze Wu
Yujing Hu
Ze Ji
Yang Gao
16
3
0
07 Oct 2023
The Blame Problem in Evaluating Local Explanations, and How to Tackle it
Amir Hossein Akhavan Rahnama
ELM
FAtt
30
4
0
05 Oct 2023
SMOOT: Saliency Guided Mask Optimized Online Training
Ali Karkehabadi
Houman Homayoun
Avesta Sasan
AAML
19
17
0
01 Oct 2023
Counterfactual Image Generation for adversarially robust and interpretable Classifiers
Rafael Bischof
F. Scheidegger
Michael A. Kraus
A. Malossi
AAML
30
2
0
01 Oct 2023