MadCLIP: Few-shot Medical Anomaly Detection with CLIP
- MedImVLM

An innovative few-shot anomaly detection approach is presented, leveraging the pre-trained CLIP model for medical data and adapting it for both image-level anomaly classification (AC) and pixel-level anomaly segmentation (AS). A dual-branch design is proposed to separately capture normal and abnormal features through learnable adapters in the CLIP vision encoder. To improve semantic alignment, learnable text prompts are employed and linked to the visual features. Furthermore, SigLIP loss is applied to effectively handle the many-to-one relationship between images and unpaired text prompts, showcasing its adaptation in the medical field for the first time. Our approach is validated on multiple modalities, demonstrating superior performance over existing methods for AC and AS, in both same-dataset and cross-dataset evaluations. Unlike prior work, it does not rely on synthetic data or memory banks, and an ablation study confirms the contribution of each component. The code is available at this https URL.
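To make the many-to-one point concrete, here is a minimal NumPy sketch of a SigLIP-style pairwise sigmoid loss, in which several images may share the same text prompt (for example, one "normal" and one "abnormal" prompt). This is an illustration under stated assumptions, not the authors' implementation: the function name `sigmoid_pairwise_loss`, the `temperature` and `bias` defaults, and the toy embeddings are all hypothetical, and in practice the temperature and bias would typically be learnable.

```python
import numpy as np

def sigmoid_pairwise_loss(img_emb, txt_emb, labels, temperature=10.0, bias=-10.0):
    """SigLIP-style loss: every (image, prompt) pair is an independent
    binary classification, so many images can match one prompt.

    img_emb: (N, D) image embeddings
    txt_emb: (P, D) prompt embeddings (e.g. P=2 for normal/abnormal)
    labels:  (N,) index of the prompt that matches each image
    temperature, bias: illustrative fixed values (assumption)
    """
    # L2-normalise so the dot product is a cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)

    logits = temperature * (img @ txt.T) + bias          # shape (N, P)

    # z[i, j] = +1 if prompt j matches image i, else -1
    z = np.where(labels[:, None] == np.arange(txt.shape[0])[None, :], 1.0, -1.0)

    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably with logaddexp
    per_pair = np.logaddexp(0.0, -z * logits)

    # Sum over all pairs, normalise by the number of images
    return per_pair.sum() / img.shape[0]


# Toy usage: two prompts, three images; images 0 and 1 share prompt 0.
prompts = np.array([[1.0, 0.0], [0.0, 1.0]])
images = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
loss = sigmoid_pairwise_loss(images, prompts, np.array([0, 0, 1]))
```

Because each pair is scored with its own sigmoid rather than a softmax over the batch, two images mapped to the same prompt do not compete with each other, which is the property the abstract appeals to.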
@article{shiri2025_2506.23810,
  title={MadCLIP: Few-shot Medical Anomaly Detection with CLIP},
  author={Mahshid Shiri and Cigdem Beyan and Vittorio Murino},
  journal={arXiv preprint arXiv:2506.23810},
  year={2025}
}