BiasICL: In-Context Learning and Demographic Biases of Vision Language Models
Vision language models (VLMs) show promise in medical diagnosis, but their performance across demographic subgroups when using in-context learning (ICL) remains poorly understood. We examine how the demographic composition of demonstration examples affects VLM performance on two medical imaging tasks: skin lesion malignancy prediction and pneumothorax detection from chest radiographs. Our analysis reveals that ICL influences model predictions through multiple mechanisms: (1) ICL allows VLMs to learn subgroup-specific disease base rates from prompts, and (2) ICL leads VLMs to make predictions that perform differently across demographic groups, even after controlling for subgroup-specific disease base rates. Our empirical results inform best practices for prompting current VLMs: specifically, examining performance by demographic subgroup and matching the label base rates of demonstrations to the target distribution, both overall and within each subgroup. They also suggest next steps for improving our theoretical understanding of these models.
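The base-rate-matching practice described above can be sketched in code. The helper below is a hypothetical illustration (the function name, parameters, and data layout are assumptions, not the authors' implementation): it samples ICL demonstrations so that each demographic subgroup's positive-label rate in the prompt matches a chosen target rate.

```python
import random
from collections import defaultdict

def sample_demonstrations(pool, k_per_group, target_rates, seed=0):
    """Sample ICL demonstrations so each demographic subgroup's
    positive-label rate matches its target base rate.

    pool:         list of (subgroup, label, example) tuples, label in {0, 1}
    k_per_group:  number of demonstrations drawn per subgroup
    target_rates: dict mapping subgroup -> desired positive base rate
    """
    rng = random.Random(seed)

    # Index the candidate pool by (subgroup, label) for stratified sampling.
    by_key = defaultdict(list)
    for group, label, example in pool:
        by_key[(group, label)].append((group, label, example))

    demos = []
    for group, rate in target_rates.items():
        # Match the subgroup's positive rate to the target distribution.
        n_pos = round(k_per_group * rate)
        n_neg = k_per_group - n_pos
        demos += rng.sample(by_key[(group, 1)], n_pos)
        demos += rng.sample(by_key[(group, 0)], n_neg)

    rng.shuffle(demos)  # avoid ordering the prompt by subgroup or label
    return demos
```

For example, with `k_per_group=10` and `target_rates={"A": 0.5, "B": 0.2}`, the prompt would contain 5 positive and 5 negative demonstrations for subgroup A, and 2 positive and 8 negative for subgroup B.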
@article{xu2025_2503.02334,
  title={BiasICL: In-Context Learning and Demographic Biases of Vision Language Models},
  author={Sonnet Xu and Joseph Janizek and Yixing Jiang and Roxana Daneshjou},
  journal={arXiv preprint arXiv:2503.02334},
  year={2025}
}