Debugging Tests for Model Explanations
arXiv:2011.05429 · 10 November 2020
Julius Adebayo, M. Muelly, Ilaria Liccardi, Been Kim
Tags: FAtt

Papers citing "Debugging Tests for Model Explanations"
50 / 135 papers shown
The Impact of Concept Explanations and Interventions on Human-Machine Collaboration
Jack Furby, Dan Cunnington, Dave Braines, Alun D. Preece
19 Oct 2025

Faithful and Interpretable Explanations for Complex Ensemble Time Series Forecasts using Surrogate Models and Forecastability Analysis
Yikai Zhao, Jiekai Ma
Tags: AI4TS
09 Oct 2025

DeepProv: Behavioral Characterization and Repair of Neural Networks via Inference Provenance Graph Analysis
Firas Ben Hmida, Abderrahmen Amich, Ata Kaboudi, Birhanu Eshete
Tags: AAML, GNN
30 Sep 2025

Targeted Fine-Tuning of DNN-Based Receivers via Influence Functions
Marko Tuononen, Heikki Penttinen, Ville Hautamäki
19 Sep 2025

Informative Post-Hoc Explanations Only Exist for Simple Functions
Eric Günther, Balázs Szabados, Robi Bhattacharjee, Sebastian Bordt, U. V. Luxburg
Tags: FAtt
15 Aug 2025

SafeFix: Targeted Model Repair via Controlled Image Generation
Ouyang Xu, Baoming Zhang, Ruiyu Mao, Yunhui Guo
12 Aug 2025

On the Performance of Concept Probing: The Influence of the Data (Extended Version)
Manuel de Sousa Ribeiro, Afonso Leote, João Leite
24 Jul 2025

Concept Probing: Where to Find Human-Defined Concepts (Extended Version)
Manuel de Sousa Ribeiro, Afonso Leote, João Leite
24 Jul 2025

Benchmarking Time-localized Explanations for Audio Classification Models
Cecilia Bolaños, L. Pepino, Martin Meza, Luciana Ferrer
04 Jun 2025
Out-of-Distribution Detection via Channelwise Feature Aggregation in Neural Network-Based Receivers
Marko Tuononen, Duy Vu, Dani Korpi, Vesa Starck, Ville Hautamäki
21 May 2025

How to Achieve Higher Accuracy with Less Training Points?
Jinghan Yang, Anupam Pani, Yunchao Zhang
18 Apr 2025

Fast Fourier Correlation is a Highly Efficient and Accurate Feature Attribution Algorithm from the Perspective of Control Theory and Game Theory
Zechen Liu, Feiyang Zhang, Wei Song, Xuelong Li, Wei Wei
Tags: FAtt
02 Apr 2025

Uncertainty Propagation in XAI: A Comparison of Analytical and Empirical Estimators
Teodor Chiaburu, Felix Bießmann, Frank Haußer
01 Apr 2025

Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
Sai Sumedh R. Hindupur, Ekdeep Singh Lubana, Thomas Fel, Demba Ba
03 Mar 2025

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
Thomas Fel, Ekdeep Singh Lubana, Jacob S. Prince, M. Kowal, Victor Boutin, Isabel Papadimitriou, Binxu Wang, Martin Wattenberg, Demba Ba, Talia Konkle
18 Feb 2025

Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics
Neural Information Processing Systems (NeurIPS), 2024
Lukas Klein, Carsten T. Lüth, U. Schlegel, Till J. Bungert, Mennatallah El-Assady, Paul F. Jäger
Tags: XAI, ELM
03 Jan 2025
Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts
Jihye Choi, Jayaram Raghuram, Shouqing Yang, Somesh Jha
18 Dec 2024

Lost in Context: The Influence of Context on Feature Attribution Methods for Object Recognition
Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP), 2024
Sayanta Adhikari, Rishav Kumar, Konda Reddy Mopuri, Rajalakshmi Pachamuthu
05 Nov 2024

Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse
International Conference on Learning Representations (ICLR), 2024
Seung Hyun Cheon, Anneke Wernerfelt, Sorelle A. Friedler, Berk Ustun
Tags: FaML, FAtt
29 Oct 2024

Fool Me Once? Contrasting Textual and Visual Explanations in a Clinical Decision-Support Setting
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Maxime Kayser, Bayar I. Menzat, Cornelius Emde, Bogdan Bercean, Alex Novak, Abdala Espinosa, B. Papież, Susanne Gaube, Thomas Lukasiewicz, Oana-Maria Camburu
16 Oct 2024

AdaptGrad: Adaptive Sampling to Reduce Noise
Linjiang Zhou, Chao Ma, Zepeng Wang, Libing Wu, Xiaochuan Shi
Tags: FAtt
10 Oct 2024

Joint Universal Adversarial Perturbations with Interpretations
Liang-bo Ning, Zeyu Dai, Wenqi Fan, Jingran Su, Chao Pan, Luning Wang, Qing Li
Tags: AAML
03 Aug 2024

Comprehensive Attribution: Inherently Explainable Vision Model with Feature Detector
European Conference on Computer Vision (ECCV), 2024
Xianren Zhang, Dongwon Lee, Suhang Wang
Tags: VLM, FAtt
27 Jul 2024
DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks
Sarah Jabbour, Gregory Kondas, Ella Kazerooni, Michael Sjoding, David Fouhey, Jenna Wiens
Tags: FAtt, DiffM
19 Jul 2024

Geometric Remove-and-Retrain (GOAR): Coordinate-Invariant eXplainable AI Assessment
Yong-Hyun Park, Junghoon Seo, Bomseok Park, Seongsu Lee, Junghyo Jo
Tags: AAML
17 Jul 2024

Benchmarking the Attribution Quality of Vision Models
Robin Hesse, Simone Schaub-Meyer, Stefan Roth
Tags: FAtt
16 Jul 2024

Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density
Peiyu Yang, Naveed Akhtar, Mubarak Shah, Lin Wang
Tags: AAML
05 Jul 2024

Axiomatization of Gradient Smoothing in Neural Networks
Linjiang Zhou, Xiaochuan Shi, Chao Ma, Zepeng Wang
Tags: FAtt
29 Jun 2024

Provably Better Explanations with Optimized Aggregation of Feature Attributions
International Conference on Machine Learning (ICML), 2024
Thomas Decker, Ananta R. Bhattarai, Jindong Gu, Volker Tresp, Florian Buettner
07 Jun 2024

Allowing humans to interactively guide machines where to look does not always improve human-AI team's classification accuracy
Giang Nguyen, Mohammad Reza Taesiri, Sunnie S. Y. Kim, Anh Totti Nguyen
Tags: HAI, AAML, FAtt
08 Apr 2024
How explainable AI affects human performance: A systematic review of the behavioural consequences of saliency maps
International Journal of Human-Computer Interaction (IJHCI), 2024
Romy Müller
Tags: HAI
03 Apr 2024

What Sketch Explainability Really Means for Downstream Tasks
Computer Vision and Pattern Recognition (CVPR), 2024
Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, A. Bhunia, Aneeshan Sain, Tao Xiang, Yi-Zhe Song
14 Mar 2024

Fine-Tuning Text-To-Image Diffusion Models for Class-Wise Spurious Feature Generation
AprilPyone Maungmaung, H. Nguyen, Hitoshi Kiya, Isao Echizen
13 Feb 2024

Black-Box Access is Insufficient for Rigorous AI Audits
Conference on Fairness, Accountability and Transparency (FAccT), 2024
Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, ..., Michael Gerovitch, David Bau, Max Tegmark, David M. Krueger, Dylan Hadfield-Menell
Tags: AAML
25 Jan 2024

Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation
Nina Weng, Paraskevas Pegios, Eike Petersen, Aasa Feragen, Siavash Bigdeli
Tags: MedIm, CML
21 Dec 2023

Evaluating the Utility of Model Explanations for Model Development
Shawn Im, Jacob Andreas, Yilun Zhou
Tags: XAI, FAtt, ELM
10 Dec 2023

DiG-IN: Diffusion Guidance for Investigating Networks -- Uncovering Classifier Differences, Neuron Visualisations, and Visual Counterfactual Explanations
Computer Vision and Pattern Recognition (CVPR), 2023
Maximilian Augustin, Yannic Neuhaus, Matthias Hein
Tags: DiffM
29 Nov 2023
On the Feasibility of Reasoning about the Internal States of Blackbox IoT Devices Using Side-Channel Information
Wei Sun, Yuwei Xiao, Haojian Jin, Dinesh Bharadia
23 Nov 2023

(Why) Is My Prompt Getting Worse? Rethinking Regression Testing for Evolving LLM APIs
Wanqin Ma, Chenyang Yang, Jane Hsieh
18 Nov 2023

SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training
Rui Xu, Wenkang Qin, Peixiang Huang, Hao Wang, Lin Luo
Tags: FAtt, AAML
09 Nov 2023

Detecting Spurious Correlations via Robust Visual Concepts in Real and AI-Generated Image Classification
Preetam Prabhu Srikar Dammu, Chirag Shah
03 Nov 2023

How Well Do Feature-Additive Explainers Explain Feature-Additive Predictors?
Zachariah Carmichael, Walter J. Scheirer
Tags: FAtt
27 Oct 2023

Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations
Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, Leilani H. Gilpin
Tags: LRM, MILM, ELM
17 Oct 2023

Transparent Anomaly Detection via Concept-based Explanations
Laya Rafiee Sevyeri, Ivaxi Sheth, Farhood Farahnak, Samira Ebrahimi Kahou, S. Enger
16 Oct 2023

NeuroInspect: Interpretable Neuron-based Debugging Framework through Class-conditional Visualizations
Yeong-Joon Ju, Ji-Hoon Park, Seong-Whan Lee
Tags: AAML
11 Oct 2023
May I Ask a Follow-up Question? Understanding the Benefits of Conversations in Neural Network Explainability
International Journal of Human-Computer Interaction (IJHCI), 2023
Tong Zhang, Xiaoyu Yang, Boyang Albert Li
25 Sep 2023

COSE: A Consistency-Sensitivity Metric for Saliency on Image Classification
Rangel Daroya, Aaron Sun, Subhransu Maji
20 Sep 2023

Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing
Conference on Computer and Communications Security (CCS), 2023
Jinwen He, Kai Chen, Guozhu Meng, Jiangshan Zhang, Congyi Li
Tags: FAtt, AAML
09 Sep 2023

FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods
IEEE International Conference on Computer Vision (ICCV), 2023
Robin Hesse, Simone Schaub-Meyer, Stefan Roth
Tags: AAML
11 Aug 2023

Right for the Wrong Reason: Can Interpretable ML Techniques Detect Spurious Correlations?
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023
Susu Sun, Lisa M. Koch, Christian F. Baumgartner
23 Jul 2023
Page 1 of 3