ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.16633
25
2

MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network

24 June 2024
Yuming Zhang
Shouxin Zhang
Peizhe Wang
Feiyu Zhu
Dongzhi Guan
Junhao Su
Jiabin Liu
Changpeng Cai
ArXivPDFHTML
Abstract

Deep neural networks (DNNs) typically employ an end-to-end (E2E) training paradigm which presents several challenges, including high GPU memory consumption, inefficiency, and difficulties in model parallelization during training. Recent research has sought to address these issues, with one promising approach being local learning. This method involves partitioning the backbone network into gradient-isolated modules and manually designing auxiliary networks to train these local modules. Existing methods often neglect the interaction of information between local modules, leading to myopic issues and a performance gap compared to E2E training. To address these limitations, we propose the Multilaminar Leap Augmented Auxiliary Network (MLAAN). Specifically, MLAAN comprises Multilaminar Local Modules (MLM) and Leap Augmented Modules (LAM). MLM captures both local and global features through independent and cascaded auxiliary networks, alleviating performance issues caused by insufficient global features. However, overly simplistic auxiliary networks can impede MLM's ability to capture global information. To address this, we further design LAM, an enhanced auxiliary network that uses the Exponential Moving Average (EMA) method to facilitate information exchange between local modules, thereby mitigating the shortsightedness resulting from inadequate interaction. The synergy between MLM and LAM has demonstrated excellent performance. Our experiments on the CIFAR-10, STL-10, SVHN, and ImageNet datasets show that MLAAN can be seamlessly integrated into existing local learning frameworks, significantly enhancing their performance and even surpassing end-to-end (E2E) training methods, while also reducing GPU memory consumption.

View on arXiv
Comments on this paper