Nucleus detection in histopathology whole slide images (WSIs) is crucial for a broad spectrum of clinical applications. Current approaches for nucleus detection in gigapixel WSIs use a sliding window methodology, which overlooks broader contextual information (e.g., tissue structure) and easily leads to inaccurate predictions. To address this problem, recent studies additionally crop a large Field-of-View (FoV) region around each sliding window to extract contextual features. However, such methods substantially increase inference latency. In this paper, we propose an effective and efficient context-aware nucleus detection algorithm. Specifically, instead of leveraging large FoV regions, we aggregate contextual clues from off-the-shelf features of historically visited sliding windows. This design greatly reduces computational overhead. Moreover, compared to large FoV regions at a low magnification, the sliding window patches have higher magnification and provide finer-grained tissue details, thereby enhancing the detection accuracy. To further improve efficiency, we propose a grid pooling technique that compresses the dense feature maps of each patch into a few contextual tokens. Finally, we craft OCELOT-seg, the first benchmark dedicated to context-aware nucleus instance segmentation. Code, dataset, and model checkpoints will be available at this https URL.
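The grid pooling and context reuse described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the pooling operator, the grid size, or how neighboring windows are selected, so average pooling, the `grid` parameter, the `radius` neighborhood, and the names `grid_pool` and `gather_context` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Token = List[float]
# Dense feature map of one sliding-window patch, indexed as [row][col][channel].
FeatureMap = List[List[List[float]]]


def grid_pool(feature_map: FeatureMap, grid: int) -> List[Token]:
    """Compress a dense feature map into grid * grid tokens.

    Assumption: each grid cell is summarized by average pooling.
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    if height % grid or width % grid:
        raise ValueError("feature map size must be divisible by grid")
    cell_h, cell_w = height // grid, width // grid
    area = cell_h * cell_w

    tokens: List[Token] = []
    for gy in range(grid):          # cells are emitted in row-major order
        for gx in range(grid):
            sums = [0.0] * channels
            for y in range(gy * cell_h, (gy + 1) * cell_h):
                for x in range(gx * cell_w, (gx + 1) * cell_w):
                    for c in range(channels):
                        sums[c] += feature_map[y][x][c]
            tokens.append([s / area for s in sums])
    return tokens


def gather_context(
    cache: Dict[Tuple[int, int], List[Token]],
    window: Tuple[int, int],
    radius: int = 1,
) -> List[Token]:
    """Collect cached tokens of already visited windows around `window`.

    Assumption: context comes from windows within `radius` steps on the
    sliding-window grid; windows not yet visited are simply skipped.
    """
    row, col = window
    context: List[Token] = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the current window is not its own context
            context.extend(cache.get((row + dy, col + dx), []))
    return context


# Usage: a 4x4 single-channel feature map pooled into a 2x2 grid of tokens.
demo_map = [[[float(y * 4 + x)] for x in range(4)] for y in range(4)]
demo_tokens = grid_pool(demo_map, grid=2)

# Tokens of visited windows are cached and reused instead of re-encoding
# a large low-magnification region for every new window.
token_cache = {(0, 0): demo_tokens, (0, 1): demo_tokens}
demo_context = gather_context(token_cache, window=(1, 1))
```

In this sketch each visited window contributes only `grid * grid` tokens to the cache, so the cost of attending to context grows with the number of tokens rather than with the full resolution of each feature map.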
@article{shui2025_2503.05678,
  title={Towards Effective and Efficient Context-aware Nucleus Detection in Histopathology Whole Slide Images},
  author={Zhongyi Shui and Ruizhe Guo and Honglin Li and Yuxuan Sun and Yunlong Zhang and Chenglu Zhu and Jiatong Cai and Pingyi Chen and Yanzhou Su and Lin Yang},
  journal={arXiv preprint arXiv:2503.05678},
  year={2025}
}