FsaNet: Frequency Self-attention for Semantic Segmentation

28 November 2022 · arXiv:2211.15595
Fengyu Zhang
Ashkan Panahi
Guangjun Gao
Abstract

Considering the spectral properties of images, we propose a new self-attention mechanism whose computational complexity is reduced, in the best case, to a linear rate. To better preserve edges while promoting similarity within objects, we propose processing different frequency bands individually. In particular, we study the case where the process operates only on low-frequency components. Through an ablation study, we show that low-frequency self-attention performs on par with or better than full-frequency self-attention, even without retraining the network. Accordingly, we design novel plug-and-play modules and embed them into the head of a CNN; we refer to the resulting network as FsaNet. The frequency self-attention 1) requires only a few low-frequency coefficients as input, 2) can be made mathematically equivalent to spatial-domain self-attention with linear structures, and 3) simplifies both the token-mapping (1×1 convolution) stage and the token-mixing stage. We show that frequency self-attention requires 87.29%∼90.04% less memory, 96.13%∼98.07% fewer FLOPs, and 97.56%∼98.18% less run time than regular self-attention. Compared to other ResNet101-based self-attention networks, FsaNet achieves a new state-of-the-art result (83.0% mIoU) on the Cityscapes test dataset and competitive results on ADE20K and VOCaug. FsaNet can also enhance Mask R-CNN for instance segmentation on COCO. In addition, the proposed module boosts SegFormer across a series of model scales, and SegFormer-B5 can be improved even without retraining. Code is available at https://github.com/zfy-csu/FsaNet
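The abstract describes running self-attention over only a few low-frequency coefficients of a CNN feature map. Below is a minimal sketch of that idea, assuming an orthonormal 2D DCT and plain dot-product attention over the retained coefficients; the module name FreqSelfAttention, the crop size k, and the residual connection are illustrative choices, not taken from the FsaNet code base.

```python
# Sketch: low-frequency self-attention on a feature map (illustrative, not the
# official FsaNet implementation).
import math
import torch
import torch.nn as nn


def dct_matrix(n: int) -> torch.Tensor:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = torch.arange(n).unsqueeze(1).float()   # frequency index
    i = torch.arange(n).unsqueeze(0).float()   # spatial index
    m = torch.cos(math.pi / n * (i + 0.5) * k)
    m[0] *= 1.0 / math.sqrt(2.0)
    return m * math.sqrt(2.0 / n)


class FreqSelfAttention(nn.Module):
    def __init__(self, channels: int, k: int = 8):
        super().__init__()
        self.k = k                              # number of low-frequency rows/cols kept
        self.query = nn.Linear(channels, channels)
        self.key = nn.Linear(channels, channels)
        self.value = nn.Linear(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        dh, dw = dct_matrix(h).to(x), dct_matrix(w).to(x)

        # 2D DCT, then crop the top-left k x k low-frequency block.
        freq = dh @ x @ dw.t()                  # (b, c, h, w) in the frequency domain
        low = freq[:, :, :self.k, :self.k]      # (b, c, k, k)

        # Dot-product self-attention over the k*k low-frequency tokens.
        tokens = low.flatten(2).transpose(1, 2)              # (b, k*k, c)
        q, kk, v = self.query(tokens), self.key(tokens), self.value(tokens)
        attn = torch.softmax(q @ kk.transpose(1, 2) / math.sqrt(c), dim=-1)
        out = attn @ v                                       # (b, k*k, c)

        # Write the refined coefficients back and invert the DCT.
        freq = freq.clone()
        freq[:, :, :self.k, :self.k] = out.transpose(1, 2).reshape(b, c, self.k, self.k)
        return dh.t() @ freq @ dw + x            # residual connection in the spatial domain


# Usage: refine a CNN feature map of shape (batch, channels, H, W).
feat = torch.randn(2, 64, 32, 32)
refined = FreqSelfAttention(channels=64, k=8)(feat)
print(refined.shape)  # torch.Size([2, 64, 32, 32])
```

Because only k×k coefficients enter the attention step, the cost of token mixing in this sketch is independent of the spatial resolution H×W, which illustrates where the memory and FLOP savings quoted in the abstract come from.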
