ResearchTrend.AI


arXiv:2412.19628

RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations

27 December 2024
Mingshu Zhao
Yi Luo
Yong Ouyang
Abstract

Recent advances in vision transformers (ViTs) have demonstrated the advantage of global modeling capabilities, prompting widespread integration of large-kernel convolutions for enlarging the effective receptive field (ERF). However, the quadratic scaling of parameter count and computational complexity (FLOPs) with respect to kernel size poses significant efficiency and optimization challenges. This paper introduces RecConv, a recursive decomposition strategy that efficiently constructs multi-frequency representations using small-kernel convolutions. RecConv establishes a linear relationship between parameter growth and decomposition levels, which determine the effective kernel size k×2^ℓ for a base kernel k and ℓ levels of decomposition, while maintaining constant FLOPs regardless of the ERF expansion. Specifically, RecConv achieves a parameter expansion of only ℓ+2 times and a maximum FLOPs increase of 5/3 times, compared to the exponential growth (4^ℓ) of standard and depthwise convolutions. RecNeXt-M3 outperforms RepViT-M1.1 by 1.9 AP^box on COCO with similar FLOPs. This innovation provides a promising avenue towards designing efficient and compact networks across various modalities. Code and models can be found at https://github.com/suous/RecNeXt.
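The scaling claims in the abstract can be checked with simple arithmetic. The sketch below (assumed function names; it does not reproduce the actual RecConv decomposition, only its parameter accounting) compares a depthwise convolution whose kernel is enlarged directly to k×2^ℓ against a RecConv-style decomposition into ℓ+2 small-kernel convolutions of size k, as stated in the abstract:

```python
def depthwise_params(channels: int, kernel: int) -> int:
    # Depthwise conv: one kernel x kernel filter per channel.
    return channels * kernel * kernel

def naive_large_kernel_params(channels: int, k: int, levels: int) -> int:
    # Enlarging the kernel directly to k * 2**levels:
    # parameters grow by (2**levels)**2 = 4**levels over the base kernel.
    return depthwise_params(channels, k * 2**levels)

def recconv_params(channels: int, k: int, levels: int) -> int:
    # RecConv-style accounting (per the abstract): levels + 2
    # small-kernel convolutions of size k, so growth is linear in levels.
    return (levels + 2) * depthwise_params(channels, k)

if __name__ == "__main__":
    C, k, l = 64, 3, 3  # 64 channels, 3x3 base kernel, 3 decomposition levels
    base = depthwise_params(C, k)
    print(naive_large_kernel_params(C, k, l) // base)  # 64, i.e. 4**3
    print(recconv_params(C, k, l) // base)             # 5, i.e. l + 2
```

At ℓ = 3 the effective kernel is 24×24, yet the decomposed form costs only 5× the base parameters instead of 64×.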
