Papers citing 'Unmasking Clever Hans Predictors and Assessing What Machines Really Learn'

Title
Human Cognitive Biases in Explanation-Based Interaction: The Case of Within and Between Session Order Effect Dario Pesenti Alessandro Bogani Katya Tentori Stefano Teso 20 0 0 04 Dec 2025
Attention Trajectories as a Diagnostic Axis for Deep Reinforcement Learning Charlotte Beylier Hannah Selder Arthur Fleig S. M. Hofmann Nico Scherf 97 0 0 25 Nov 2025
Learning to Seek Evidence: A Verifiable Reasoning Agent with Causal Faithfulness Analysis Yuhang Huang Zekai Lin Fan Zhong Lei Liu CML LRM 140 0 0 03 Nov 2025
Imbalanced Classification through the Lens of Spurious Correlations Jakob Hackstein Sidney Bender 120 0 0 31 Oct 2025
Mitigating Clever Hans Strategies in Image Classifiers through Generating Counterexamples Sidney Bender Ole Delzer J. Herrmann Heike Marxfeld Klaus-Robert Müller G. Montavon 207 1 0 20 Oct 2025
Circuit Insights: Towards Interpretability Beyond Activations Elena Golimblevskaia Aakriti Jain Bruno Puri Ammar Ibrahim Wojciech Samek Sebastian Lapuschkin FAtt 207 0 0 16 Oct 2025
LeapFactual: Reliable Visual Counterfactual Explanation Using Conditional Flow Matching Zhuo Cao Xuan Zhao Lena Krieger Hanno Scharr Ira Assent OOD 269 0 0 16 Oct 2025
o-MEGA: Optimized Methods for Explanation Generation and Analysis Ľuboš Kriš Jaroslav Kopčan Qiwei Peng Andrej Ridzik Marcel Veselý Martin Tamajka 146 0 0 30 Sep 2025
An Experimental Study on Generating Plausible Textual Explanations for Video Summarization Thomas Eleftheriadis Evlampios Apostolidis Vasileios Mezaris 82 0 0 30 Sep 2025
TDHook: A Lightweight Framework for Interpretability Yoann Poupart AI4CE 124 0 0 29 Sep 2025
Explaining multimodal LLMs via intra-modal token interactions Jiawei Liang Ruoyu Chen Xianghao Jiao Siyuan Liang Shiming Liu Qunli Zhang Zheng Hu Xiaochun Cao LRM 165 0 0 26 Sep 2025
Value bounds and Convergence Analysis for Averages of LRP attributions Alexander Binder Nastaran Takmil-Homayouni Ürün Dogan FAtt 228 0 0 10 Sep 2025
Model Science: getting serious about verification, explanation and control of AI systems Przemyslaw Biecek Wojciech Samek 120 0 0 27 Aug 2025
How can we trust opaque systems? Criteria for robust explanations in XAI Florian J. Boge Annika Schuster AAML 100 0 0 18 Aug 2025
AIM: Amending Inherent Interpretability via Self-Supervised Masking Eyad Alshami Shashank Agnihotri Bernt Schiele Margret Keuper AAML 143 1 0 15 Aug 2025
SIDE: Sparse Information Disentanglement for Explainable Artificial Intelligence Viktar Dubovik Łukasz Struski Jacek Tabor Dawid Rymarczyk 223 0 0 25 Jul 2025
MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025 David Jacob Drexlin Jonas Dippel Julius Hense Niklas Prenißl G. Montavon Frederick Klauschen Klaus-Robert Müller DiffM MedIm 130 2 0 20 Jun 2025
TRUST: Transparent, Robust and Ultra-Sparse Trees Albert Dorador 100 2 0 18 Jun 2025
EvolvTrip: Enhancing Literary Character Understanding with Temporal Theory-of-Mind Graphs Bohao Yang Hainiu Xu Jinhua Du Ze Li Petr Slovak Chenghua Lin 153 0 0 16 Jun 2025
PiPViT: Patch-based Visual Interpretable Prototypes for Retinal Image Analysis Marzieh Oghbaie Teresa Araújo Hrvoje Bogunović ViT MedIm 340 0 0 12 Jun 2025
Identifying Alzheimer's Disease Prediction Strategies of Convolutional Neural Network Classifiers using R2* Maps and Spectral Clustering C. Tinauer Maximilian Sackl Stefan Ropele C. Langkammer 201 1 0 04 Jun 2025
Wasserstein Distances Made Explainable: Insights into Dataset Shifts and Transport Phenomena Philip Naumann Jacob R. Kauffmann G. Montavon 258 0 0 09 May 2025
What Do Large Language Models Know? Tacit Knowledge as a Potential Causal-Explanatory Structure. Philosophia Scientiæ (PS), 2025 Céline Budding 136 3 0 16 Apr 2025
Uncovering the Structure of Explanation Quality with Spectral Analysis Johannes Maeß G. Montavon Shinichi Nakajima Klaus-Robert Müller Thomas Schnake FAtt 323 0 0 11 Apr 2025
Cat, Rat, Meow: On the Alignment of Language Model and Human Term-Similarity Judgments Lorenz Linhardt Tom Neuhäuser Lenka Tětková Oliver Eberle ALM AI4TS 202 2 0 10 Apr 2025
Diffusion Counterfactuals for Image Regressors Trung Duc Ha Sidney Bender DiffM 360 2 0 26 Mar 2025
Automated Processing of eXplainable Artificial Intelligence Outputs in Deep Learning Models for Fault Diagnostics of Large Infrastructures. Engineering applications of artificial intelligence (EAAI), 2025 Giovanni Floreale Piero Baraldi Enrico Zio Olga Fink 212 2 0 19 Mar 2025
Unifying Perplexing Behaviors in Modified BP Attributions through Alignment Perspective Guanhua Zheng Jitao Sang Changsheng Xu AAML FAtt 300 0 0 14 Mar 2025
Interactive Medical Image Analysis with Concept-based Similarity Reasoning. Computer Vision and Pattern Recognition (CVPR), 2025 Ta Duc Huy Sen Kim Tran Phan Nguyen Nguyen Hoang Tran Tran Bao Sam Anton Van Den Hengel Zhibin Liao Johan Verjans Minh-Son To Vu Minh Hieu Phan 312 7 0 10 Mar 2025
Machine Learning in Biomechanics: Key Applications and Limitations in Walking, Running, and Sports Movements Carlo Dindorf Fabian Horst D. Slijepcevic Bernhard Dumphart Jonas Dully Matthias Zeppelzauer B. Horsak Michael Fröhlich 202 5 0 05 Mar 2025
Do ImageNet-trained models learn shortcuts? The impact of frequency shortcuts on generalization. Computer Vision and Pattern Recognition (CVPR), 2025 Shunxin Wang Raymond N. J. Veldhuis N. Strisciuglio VLM 464 1 0 05 Mar 2025
A Close Look at Decomposition-based XAI-Methods for Transformer Language Models L. Arras Bruno Puri Patrick Kahardipraja Sebastian Lapuschkin Wojciech Samek 292 4 0 21 Feb 2025
B-cos LM: Efficiently Transforming Pre-trained Language Models for Improved Explainability Yifan Wang Sukrut Rao Ji-Ung Lee Mayank Jobanputra Vera Demberg 227 0 0 18 Feb 2025
The Cake that is Intelligence and Who Gets to Bake it: An AI Analogy and its Implications for Participation Martin Mundt Anaelia Ovalle Felix Friedrich A Pranav Subarnaduti Paul Manuel Brack Kristian Kersting William Agnew 1.3K 1 0 05 Feb 2025
P-TAME: Explain Any Image Classifier with Trained Perturbations. IEEE Open Journal of Signal Processing (JOSP), 2025 Mariano V. Ntrougkas Vasileios Mezaris Ioannis Patras AAML FAtt 232 0 0 29 Jan 2025
Skull-stripping induces shortcut learning in MRI-based Alzheimer's disease classification C. Tinauer Maximilian Sackl Rudolf Stollberger Reinhold Schmidt Stefan Ropele C. Langkammer AAML 246 1 0 27 Jan 2025
Mechanistic understanding and validation of large AI models with SemanticLens Maximilian Dreyer J. Berend Tobias Labarta Johanna Vielhaben Thomas Wiegand Sebastian Lapuschkin Wojciech Samek 199 23 0 10 Jan 2025
Predictable Artificial Intelligence Lexin Zhou Pablo Antonio Moreno Casares Fernando Martínez-Plumed John Burden Ryan Burnell ... Seán Ó hÉigeartaigh Danaja Rutar Wout Schellaert Konstantinos Voudouris José Hernández-Orallo 481 8 0 08 Jan 2025
xMIL: Insightful Explanations for Multiple Instance Learning in Histopathology. Neural Information Processing Systems (NeurIPS), 2024 Julius Hense M. J. Idaji Oliver Eberle Thomas Schnake Jonas Dippel Laure Ciernik Oliver Buchstab Andreas Mock Frederick Klauschen Klaus-Robert Müller 247 8 0 08 Jan 2025
Neural network interpretability with layer-wise relevance propagation: novel techniques for neuron selection and visualization. Computing and Communication Workshop and Conference (CC), 2024 Deepshikha Bhati Fnu Neha Md. Amiruzzaman Angela Guercio Deepak Kumar Shukla Ben Ward FAtt 369 2 0 07 Dec 2024
Explaining the Impact of Training on Vision Models via Activation Clustering Ahcène Boubekki Samuel G. Fadel Sebastian Mair 697 1 0 29 Nov 2024
Aligning Generalisation Between Humans and Machines Filip Ilievski Barbara Hammer F. V. Harmelen Benjamin Paassen S. Saralajew ... Vered Shwartz Gabriella Skitalinskaya Clemens Stachl Gido M. van de Ven T. Villmann 697 5 0 23 Nov 2024
Shortcut Learning in In-Context Learning: A Survey Rui Song Yingji Li Fausto Giunchiglia Hao Xu 386 4 0 04 Nov 2024
Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers Lam Nguyen Tung Steven Cho Xiaoning Du Neelofar Neelofar Valerio Terragni Stefano Ruberto Aldeida Aleti 1.2K 3 0 30 Oct 2024
Improving Image Data Leakage Detection in Automotive Software Md Abu Ahammed Babu Sushant Kumar Pandey Darko Durisic Ashok Chaitanya Koppisetty Miroslaw Staron 186 1 0 29 Oct 2024
Study on the Helpfulness of Explainable Artificial Intelligence Tobias Labarta Elizaveta Kulicheva Ronja Froelian Christian Geißler Xenia Melman Julian von Klitzing ELM 215 6 0 14 Oct 2024
Self-eXplainable AI for Medical Image Analysis: A Survey and New Outlooks Junlin Hou Sicen Liu Yequan Bie Hongmei Wang Andong Tan Luyang Luo Hao Chen XAI 352 26 0 03 Oct 2024
Facing Asymmetry -- Uncovering the Causal Link between Facial Symmetry and Expression Classifiers using Synthetic Interventions. Asian Conference on Computer Vision (ACCV), 2024 Tim Buchner Niklas Penzel Orlando Guntinas-Lichius Joachim Denzler CVBM 312 3 0 24 Sep 2024
Explainable AI needs formalization Stefan Haufe Rick Wilming Benedict Clark Rustam Zhumagambetov Danny Panknin Ahcène Boubekki XAI 412 4 0 22 Sep 2024
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024 Hugo Porta Emanuele Dalsasso Diego Marcos D. Tuia 549 1 0 14 Sep 2024