
Laplacian Reconstruction and Refinement for Semantic Segmentation

Abstract

CNN architectures have terrific recognition performance but rely on spatial pooling, which makes it difficult to adapt them to tasks that require dense, pixel-accurate labeling. This paper makes two contributions: (1) We demonstrate that while the apparent spatial resolution of convolutional feature maps is low, the high-dimensional feature representation contains significant sub-pixel localization information. (2) We describe a multi-resolution reconstruction architecture, akin to a Laplacian pyramid, that uses skip connections from higher-resolution feature maps to successively refine segment boundaries reconstructed from lower-resolution maps. This approach yields state-of-the-art semantic segmentation results on PASCAL without resorting to more complex CRF or detection-driven architectures.
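For readers unfamiliar with the Laplacian pyramid that the abstract uses as an analogy, the following is a minimal sketch of the classical image version, not the paper's network. It assumes 2x2 average pooling for downsampling and nearest-neighbour upsampling, and all function names are illustrative. Each level stores the detail lost by downsampling, and reconstruction proceeds coarse to fine by upsampling and adding that detail back, which loosely mirrors how the described architecture refines low-resolution predictions with higher-resolution information.

```python
import numpy as np


def downsample(x):
    """Halve resolution with 2x2 average pooling (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Double resolution with nearest-neighbour replication."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)


def build_laplacian_pyramid(image, levels):
    """Return (residuals, coarse): per-level detail maps, finest first, plus the coarsest image."""
    residuals = []
    current = image
    for _ in range(levels):
        low = downsample(current)
        # Detail that the low-resolution version cannot represent.
        residuals.append(current - upsample(low))
        current = low
    return residuals, current


def reconstruct(residuals, coarse):
    """Rebuild the image coarse to fine: upsample, then add the stored detail."""
    current = coarse
    for residual in reversed(residuals):
        current = upsample(current) + residual
    return current


rng = np.random.default_rng(0)
image = rng.random((16, 16))
residuals, coarse = build_laplacian_pyramid(image, levels=3)
restored = reconstruct(residuals, coarse)
print(coarse.shape, np.allclose(restored, image))
```

In this classical setting the residuals are computed from the image itself, so reconstruction is exact up to floating-point error. In a segmentation network the high-resolution correction would instead have to be predicted from learned features, which is where skip connections come in.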
