A Simple and Generalist Approach for Panoptic Segmentation

Abstract
Panoptic segmentation is an important computer vision task, and current state-of-the-art solutions require specialized components to perform well. We propose a simple generalist framework based on a deep-encoder, shallow-decoder architecture with per-pixel prediction: essentially, fine-tuning a massively pretrained image model with minimal additional components. Applied naively, this method does not yield good results. We show that this is due to imbalance during training and propose a novel method for reducing it: centroid regression in the space of spectral positional embeddings. Our method achieves a panoptic quality (PQ) of 55.1 on the challenging MS-COCO dataset, state-of-the-art performance among generalist methods.
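To make the idea of centroid regression in an embedding space concrete, here is a minimal sketch of how per-pixel regression targets might be built. It is not the authors' implementation: the abstract does not specify the embedding, so the sinusoidal (sin/cos) features, the frequency schedule, the normalization to [0, 1], and the names `centroid_targets` and `spectral_embedding` are all assumptions made for illustration.

```python
import math


def centroid_targets(instance_map):
    """For each pixel, return the normalized (row, col) centroid of its instance.

    instance_map is a list of rows of integer instance ids (hypothetical input format).
    Coordinates are divided by (size - 1) so they fall in [0, 1].
    """
    height, width = len(instance_map), len(instance_map[0])
    sums = {}  # instance id -> [sum of rows, sum of cols, pixel count]
    for r, row in enumerate(instance_map):
        for c, inst in enumerate(row):
            acc = sums.setdefault(inst, [0.0, 0.0, 0])
            acc[0] += r
            acc[1] += c
            acc[2] += 1
    centroids = {
        inst: (acc[0] / acc[2] / max(height - 1, 1),
               acc[1] / acc[2] / max(width - 1, 1))
        for inst, acc in sums.items()
    }
    return [[centroids[inst] for inst in row] for row in instance_map]


def spectral_embedding(position, num_freqs=4):
    """Encode a normalized position with sin/cos features (assumed encoding).

    Frequencies double at each step: pi, 2*pi, 4*pi, ...
    """
    features = []
    for x in position:
        for k in range(num_freqs):
            freq = (2 ** k) * math.pi
            features.append(math.sin(freq * x))
            features.append(math.cos(freq * x))
    return features


# Toy 2x3 image with two instances (ids 1 and 2).
toy_map = [[1, 1, 2],
           [1, 1, 2]]
targets = centroid_targets(toy_map)
embedded = [[spectral_embedding(t) for t in row] for row in targets]
```

In this sketch every pixel of an instance shares the same target vector, so a per-pixel head could regress `embedded` directly and instances could later be grouped by their predicted centroids. How the paper decodes predictions and balances the loss is not stated in the abstract and is left out here.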
@article{prisadnikov2025_2408.16504,
  title={A Simple and Generalist Approach for Panoptic Segmentation},
  author={Nedyalko Prisadnikov and Wouter Van Gansbeke and Danda Pani Paudel and Luc Van Gool},
  journal={arXiv preprint arXiv:2408.16504},
  year={2025}
}