Laplacian Reconstruction and Refinement for Semantic Segmentation
CNN architectures have terrific recognition performance but rely on spatial pooling, which makes it difficult to adapt them to tasks that require dense, pixel-accurate labeling. This paper makes two contributions: (1) we demonstrate that while the apparent spatial resolution of convolutional feature maps is low, the high-dimensional feature representation contains significant sub-pixel localization information; (2) we describe a multi-resolution reconstruction architecture, akin to a Laplacian pyramid, that uses skip connections from higher-resolution feature maps to successively refine segment boundaries reconstructed from lower-resolution maps. This approach yields state-of-the-art semantic segmentation results on PASCAL without resorting to more complex CRF or detection-driven architectures.
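To make the Laplacian pyramid analogy concrete, the following is a minimal NumPy sketch of classical Laplacian-pyramid decomposition and coarse-to-fine reconstruction on a single 2D map. It is not the paper's network: the function names (`downsample`, `upsample`, `build_pyramid`, `reconstruct`), the 2x2 average pooling, and the nearest-neighbour upsampling are all assumptions chosen for illustration. Here the refinement terms are computed directly from a known signal, whereas the abstract describes obtaining refinement from higher-resolution feature maps through skip connections.

```python
import numpy as np

def downsample(x):
    """Halve resolution by averaging each 2x2 block (assumes even height and width)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Double resolution by nearest-neighbour repetition."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def build_pyramid(x, levels):
    """Return the coarsest map and the per-level residuals, ordered coarse to fine."""
    residuals = []
    for _ in range(levels):
        low = downsample(x)
        # The residual holds the detail that the low-resolution map cannot represent.
        residuals.append(x - upsample(low))
        x = low
    return x, residuals[::-1]

def reconstruct(coarse, residuals):
    """Upsample the coarse map step by step, adding a refinement term at each level."""
    out = coarse
    for r in residuals:
        out = upsample(out) + r
    return out

# Hypothetical stand-in for a single-class score map.
rng = np.random.default_rng(0)
score_map = rng.random((16, 16))

coarse, residuals = build_pyramid(score_map, levels=3)
restored = reconstruct(coarse, residuals)
```

In this toy setting, `restored` matches `score_map` up to floating-point error, because each residual stores exactly what was lost at its level. The structural point is the loop in `reconstruct`: a low-resolution estimate is upsampled and corrected at every step by a higher-resolution term, which is the pattern the abstract's "successively refine" refers to.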