To What Extent Do Token-Level Representations from Pathology Foundation Models Improve Dense Prediction?

Weiming Chen
Xitong Ling
Xidong Wang
Zhenyang Cai
Yijia Guo
Mingxi Fu
Ziyi Zeng
Minxi Ouyang
Jiawen Li
Yizhi Wang
Tian Guan
Benyou Wang
Yonghong He
Main: 8 Pages
8 Figures
Bibliography: 3 Pages
93 Tables
Appendix: 95 Pages
Abstract

Pathology foundation models (PFMs) have advanced rapidly and are becoming a common backbone for downstream clinical tasks, offering strong transferability across tissues and institutions. For dense prediction (e.g., segmentation), however, practical deployment still lacks a clear, reproducible understanding of how different PFMs behave across datasets and how adaptation choices affect performance and stability. We present PFM-DenseBench, a large-scale benchmark for dense pathology prediction that evaluates 17 PFMs across 18 public segmentation datasets. Under a unified protocol, we systematically assess PFMs with multiple adaptation and fine-tuning strategies, and derive practice-oriented findings on when and why different PFMs and tuning choices succeed or fail across heterogeneous datasets. We release containers, configs, and dataset cards to enable reproducible evaluation and informed PFM selection for real-world dense pathology tasks. Project Website: this https URL