Open-source framework for detecting bias and overfitting for large pathology images
Even foundation models trained on datasets with billions of samples may develop shortcuts that lead to overfitting and bias. Shortcuts are non-relevant patterns in data, such as background color or color intensity. To ensure the robustness of deep learning applications, methods are therefore needed to detect and remove such shortcuts. Today's model debugging methods are time consuming, since they often require customization to fit a given model architecture in a specific domain. We propose a generalized, model-agnostic framework to debug deep learning models. We focus on the domain of histopathology, which has very large images that require large models, and therefore large computational resources; our framework, however, can be run on a workstation with a commodity GPU. We demonstrate that our framework can replicate non-image shortcuts that have been found in previous work on self-supervised learning models, and we also identify possible shortcuts in a foundation model. Our easy-to-use tests contribute to the development of more reliable, accurate, and generalizable models for whole-slide image (WSI) analysis. Our framework is available as an open-source tool on GitHub.
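The core idea of a shortcut test can be illustrated with a simple probe: if a non-relevant attribute (such as a background-color bucket) is predictable from a model's embeddings well above chance, the model may have encoded a shortcut. The sketch below is illustrative only, not the paper's actual test; the nearest-centroid probe and the synthetic data are assumptions made for the example.

```python
import numpy as np

def shortcut_probe(embeddings, confounder_labels, seed=0):
    """Nearest-centroid probe: train/test split, then predict a
    non-relevant attribute from embeddings. Accuracy well above
    chance suggests the embeddings encode a shortcut.
    (Illustrative sketch, not the framework's actual method.)"""
    rng = np.random.default_rng(seed)
    n = len(confounder_labels)
    idx = rng.permutation(n)
    train, test = idx[: n // 2], idx[n // 2 :]
    classes = np.unique(confounder_labels)
    # One centroid per confounder class, estimated on the train half.
    centroids = np.stack([
        embeddings[train][confounder_labels[train] == c].mean(axis=0)
        for c in classes
    ])
    # Assign each test embedding to its nearest centroid.
    dists = np.linalg.norm(
        embeddings[test][:, None, :] - centroids[None], axis=-1
    )
    preds = classes[dists.argmin(axis=1)]
    return (preds == confounder_labels[test]).mean()

# Synthetic demo: dimension 0 leaks a "background color" bucket.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=400)
emb = rng.normal(size=(400, 64))
emb[:, 0] += 3.0 * labels  # the leaked confounder signal
acc = shortcut_probe(emb, labels)
print(f"probe accuracy: {acc:.2f}")  # well above the 0.5 chance level
```

In practice such a probe would be run on real slide embeddings against metadata like scanner, site, or stain batch; near-chance accuracy is the desired outcome.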
@article{sildnes2025_2503.01827,
  title   = {Open-source framework for detecting bias and overfitting for large pathology images},
  author  = {Anders Sildnes and Nikita Shvetsov and Masoud Tafavvoghi and Vi Ngoc-Nha Tran and Kajsa Møllersen and Lill-Tove Rasmussen Busund and Thomas K. Kilvær and Lars Ailo Bongo},
  journal = {arXiv preprint arXiv:2503.01827},
  year    = {2025}
}