Immunocto: a massive immune cell database auto-generated for histopathology

With the advent of novel cancer treatment options such as immunotherapy, studying the tumour immune micro-environment (TIME) is crucial to inform prognosis and to understand potential response to therapeutic agents. A key approach to characterising the TIME may be to combine (1) digitised high-resolution optical microscopy images of hematoxylin and eosin (H&E) stained tissue sections, obtained in routine histopathology examinations, with (2) automated immune cell detection and classification methods. In this work, we introduce a workflow to automatically generate robust single-cell contours and labels from tissue sections dually stained with H&E and multiplexed immunofluorescence (IF) markers. The approach harnesses the Segment Anything Model and requires minimal human intervention compared to existing single-cell databases. With this methodology, we create Immunocto, a massive, automatically generated database of 6,848,454 human cells and objects, including 2,282,818 immune cells distributed across 4 subtypes: CD4 T cell lymphocytes, CD8 T cell lymphocytes, CD20 B cell lymphocytes, and CD68/CD163 macrophages. For each cell, we provide a 64×64 pixel H&E image at 40× magnification, along with a binary mask of the nucleus and a label. The database, which is made publicly available, can be used to train models to study the TIME on routine H&E slides. We show that deep learning models trained on Immunocto achieve state-of-the-art performance for lymphocyte detection. The approach demonstrates the benefits of using matched H&E and IF data to generate robust databases for computational pathology applications.
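As an illustration of the per-cell record described above (an H&E patch, a binary nucleus mask and a label), the following is a minimal Python sketch. It is not the database's actual file format or loading API: the class name `CellSample`, the label strings, the `"other"` category and the helper functions are all hypothetical, chosen only to mirror the quantities stated in the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple

PATCH_SIZE = 64  # patch side length in pixels, as stated in the abstract

# The four immune subtypes named in the abstract; the string keys are hypothetical.
IMMUNE_LABELS = ("CD4", "CD8", "CD20", "CD68/CD163")
OTHER_LABEL = "other"  # hypothetical label for non-immune cells and objects

# Counts quoted in the abstract.
TOTAL_OBJECTS = 6_848_454
IMMUNE_CELLS = 2_282_818

Pixel = Tuple[int, int, int]  # (R, G, B), assumed 8-bit per channel


@dataclass
class CellSample:
    """Hypothetical in-memory record for one cell: patch, nucleus mask, label."""
    image: List[List[Pixel]]  # PATCH_SIZE rows of PATCH_SIZE RGB pixels
    mask: List[List[int]]     # PATCH_SIZE rows of PATCH_SIZE values in {0, 1}
    label: str

    def validate(self) -> None:
        """Raise ValueError if shapes, mask values or label are inconsistent."""
        for name, grid in (("image", self.image), ("mask", self.mask)):
            if len(grid) != PATCH_SIZE or any(len(row) != PATCH_SIZE for row in grid):
                raise ValueError(f"{name} must be {PATCH_SIZE}x{PATCH_SIZE}")
        if any(v not in (0, 1) for row in self.mask for v in row):
            raise ValueError("mask must be binary (0 or 1)")
        if self.label not in IMMUNE_LABELS + (OTHER_LABEL,):
            raise ValueError(f"unknown label: {self.label!r}")

    def nucleus_area(self) -> int:
        """Number of pixels flagged as nucleus in the binary mask."""
        return sum(sum(row) for row in self.mask)


def immune_fraction(immune: int = IMMUNE_CELLS, total: int = TOTAL_OBJECTS) -> float:
    """Share of database entries that are immune cells."""
    return immune / total


def make_dummy_sample(label: str = "CD8") -> CellSample:
    """Build a synthetic sample: white patch with an 8x8 square 'nucleus'."""
    image = [[(255, 255, 255)] * PATCH_SIZE for _ in range(PATCH_SIZE)]
    mask = [[0] * PATCH_SIZE for _ in range(PATCH_SIZE)]
    for r in range(28, 36):
        for c in range(28, 36):
            mask[r][c] = 1
    return CellSample(image=image, mask=mask, label=label)


if __name__ == "__main__":
    sample = make_dummy_sample()
    sample.validate()
    print("nucleus area (pixels):", sample.nucleus_area())
    print("immune fraction:", round(immune_fraction(), 4))
```

With the counts quoted above, immune cells make up one third of all entries (2,282,818 × 3 = 6,848,454), which a training pipeline may want to take into account when balancing classes.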
@article{simard2025_2406.02618,
  title={Immunocto: a massive immune cell database auto-generated for histopathology},
  author={Mikaël Simard and Zhuoyan Shen and Konstantin Bräutigam and Rasha Abu-Eid and Maria A. Hawkins and Charles-Antoine Collins-Fekete},
  journal={arXiv preprint arXiv:2406.02618},
  year={2025}
}