It is an open secret that ImageNet is treated as the panacea of pretraining. Particularly in medical machine learning, models that are not trained from scratch are often finetuned from ImageNet-pretrained models. We posit that pretraining on data from the domain of the downstream task should almost always be preferred instead. We leverage RadNet-12M, a dataset containing more than 12 million computed tomography (CT) image slices, to explore the efficacy of self-supervised pretraining on medical and natural images. Our experiments cover intra- and cross-domain transfer scenarios, varying data scales, finetuning vs. linear evaluation, and feature space analysis. We observe that intra-domain transfer compares favorably to cross-domain transfer, achieving comparable or improved performance (a 0.44% to 2.07% performance increase using RadNet pretraining, depending on the experiment), and we demonstrate the existence of a domain boundary-related generalization gap and of domain-specific learned features.
@article{jonske2025_2306.17555,
  title={Why does my medical AI look at pictures of birds? Exploring the efficacy of transfer learning across domain boundaries},
  author={Frederic Jonske and Moon Kim and Enrico Nasca and Janis Evers and Johannes Haubold and René Hosch and Felix Nensa and Michael Kamp and Constantin Seibold and Jan Egger and Jens Kleesiek},
  journal={arXiv preprint arXiv:2306.17555},
  year={2025}
}