Recent advances in medical vision-language models (VLMs) demonstrate impressive performance on image classification tasks, driven by their strong zero-shot generalization capabilities. However, given the high variability and complexity inherent in medical imaging data, the ability of these models to detect out-of-distribution (OOD) data in this domain remains underexplored. In this work, we conduct the first systematic investigation into the OOD detection potential of medical VLMs. We evaluate state-of-the-art VLM-based OOD detection methods across a diverse set of medical VLMs, including both general-purpose and domain-specific models. To accurately reflect real-world challenges, we introduce a cross-modality evaluation pipeline for benchmarking full-spectrum OOD detection, rigorously assessing model robustness against both semantic shifts and covariate shifts. Furthermore, we propose a novel hierarchical prompt-based method that significantly enhances OOD detection performance. Extensive experiments validate the effectiveness of our approach. The code is available at this https URL.
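The abstract does not spell out how VLM-based zero-shot OOD scoring works; as background, a common baseline (maximum concept matching over image-prompt similarities) can be sketched as below. This is a minimal illustration with toy random embeddings, not the paper's hierarchical prompt method; the function name, temperature value, and embedding dimensions are assumptions for the sketch.

```python
import numpy as np

def mcm_score(image_emb, prompt_embs, temperature=0.1):
    """MCM-style zero-shot OOD score: softmax over cosine similarities
    between an image embedding and the class-prompt embeddings.
    The maximum probability is the in-distribution confidence;
    a lower value suggests the image is OOD."""
    img = image_emb / np.linalg.norm(image_emb)
    prompts = prompt_embs / np.linalg.norm(prompt_embs, axis=1, keepdims=True)
    sims = (prompts @ img) / temperature          # scaled cosine similarities
    probs = np.exp(sims - sims.max())             # numerically stable softmax
    probs /= probs.sum()
    return probs.max()

# Toy stand-ins for CLIP-style text/image embeddings (dimension 64 assumed).
rng = np.random.default_rng(0)
prompts = rng.normal(size=(5, 64))                # 5 class-prompt embeddings
id_image = prompts[2] + 0.1 * rng.normal(size=64) # image near a known class
ood_image = rng.normal(size=64)                   # unrelated direction

print(mcm_score(id_image, prompts), mcm_score(ood_image, prompts))
```

In practice the embeddings would come from the medical VLM's image and text encoders, and an image is flagged OOD when its score falls below a chosen threshold.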
@article{ju2025_2503.01020,
  title={Delving into Out-of-Distribution Detection with Medical Vision-Language Models},
  author={Lie Ju and Sijin Zhou and Yukun Zhou and Huimin Lu and Zhuoting Zhu and Pearse A. Keane and Zongyuan Ge},
  journal={arXiv preprint arXiv:2503.01020},
  year={2025}
}