Caption-Driven Explainability: Probing CNNs for Bias via CLIP

24 October 2025

Patrick Koller

Amil Dravid

Guido M. Schuster

Aggelos K. Katsaggelos

Main: 4 Pages

5 Figures

Bibliography: 1 Page

Abstract

Robustness has become one of the most critical problems in machine learning (ML). The science of interpreting ML models to understand their behavior and improve their robustness is referred to as explainable artificial intelligence (XAI). One of the state-of-the-art XAI methods for computer vision problems is to generate saliency maps. A saliency map highlights the region of an image's pixel space that excites the ML model the most. However, this property can be misleading if spurious and salient features are present in overlapping pixel spaces. In this paper, we propose a caption-based XAI method, which integrates a standalone model to be explained into the contrastive language-image pre-training (CLIP) model using a novel network surgery approach. The resulting caption-based XAI model identifies the dominant concept that contributes the most to the model's prediction. This explanation minimizes the risk of the standalone model falling for a covariate shift and contributes significantly towards developing robust ML models. Our code is available at this https URL
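The idea of naming a dominant concept through captions can be illustrated with a minimal sketch of CLIP-style caption scoring. This is not the network surgery approach proposed in the paper; it only shows the generic step of comparing an image embedding with caption embeddings and picking the best match. The embeddings, the captions, the temperature value, and the function names `caption_scores` and `dominant_concept` are all hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def caption_scores(image_embedding, caption_embeddings, temperature=0.01):
    """Softmax over temperature-scaled cosine similarities (CLIP-style).

    The temperature is an assumed value, not one taken from the paper.
    """
    logits = {
        caption: cosine(image_embedding, emb) / temperature
        for caption, emb in caption_embeddings.items()
    }
    # Subtract the maximum logit for numerical stability.
    max_logit = max(logits.values())
    exps = {c: math.exp(l - max_logit) for c, l in logits.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def dominant_concept(image_embedding, caption_embeddings):
    """Return the caption with the highest score."""
    scores = caption_scores(image_embedding, caption_embeddings)
    return max(scores, key=scores.get)


# Toy embeddings standing in for encoder outputs (hypothetical values).
image_embedding = [0.9, 0.1, 0.2]
caption_embeddings = {
    "a photo of a red digit": [1.0, 0.0, 0.1],
    "a photo of a round digit": [0.1, 1.0, 0.0],
}

scores = caption_scores(image_embedding, caption_embeddings)
best = dominant_concept(image_embedding, caption_embeddings)
print(best)
```

In this toy setting, a caption about color outscoring a caption about shape would hint that the embedded features encode color more strongly than shape, which is the kind of signal a caption-based explanation aims to surface.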