ResearchTrend.AI
arXiv:2412.19820
GaLore+: Boosting Low-Rank Adaptation for LLMs with Cross-Head Projection

31 December 2024
Xutao Liao
Shaohui Li
Yuhui Xu
Zhi Li
Y. Liu
You He
Abstract

Recent low-rank training methods, such as GaLore, have significantly reduced the memory required to optimize large language models (LLMs). However, these methods often suffer from time-consuming low-rank projection estimation. In particular, the singular value decomposition (SVD) in GaLore can consume more than 80% of the total training time. To address this issue, we propose GaLore+, which uses cross-head low-rank projection to reduce the substantial cost of estimating low-rank projections for multi-head attention. In addition, we employ randomized subspace iteration to achieve fast SVD. To further enhance performance, we propose sparsely coded residuals to reduce the errors that low-rank approximation introduces into the optimizer's first- and second-order moments and the weight updates. We evaluate GaLore+ on arithmetic reasoning and natural language generation datasets. Our experiments demonstrate that GaLore+ delivers superior performance while fine-tuning approximately 4× faster than vanilla GaLore.
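The fast-SVD step mentioned in the abstract refers to randomized subspace iteration, a standard technique for approximating the top singular subspace of a gradient matrix much faster than a full SVD. The sketch below is a generic NumPy implementation of that technique, not code from the paper; the function name and parameters are our own, and it omits GaLore+'s cross-head sharing and residual coding.

```python
import numpy as np

def randomized_svd(G, rank, n_iter=2, seed=0):
    """Approximate truncated SVD of G via randomized subspace iteration.

    Generic sketch (Halko-style): cheap when rank << min(G.shape),
    which is the regime low-rank gradient projection operates in.
    """
    rng = np.random.default_rng(seed)
    m, n = G.shape
    # Random test matrix sketches the row space of G.
    Omega = rng.standard_normal((n, rank))
    Y = G @ Omega
    for _ in range(n_iter):
        # Power iterations sharpen the subspace estimate.
        Y, _ = np.linalg.qr(Y)
        Y = G @ (G.T @ Y)
    # Orthonormal basis for the (approximate) range of G.
    Q, _ = np.linalg.qr(Y)
    # Exact SVD of the small (rank x n) projected matrix.
    B = Q.T @ G
    U_b, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ U_b, s, Vt
```

For a gradient of shape (m, n), the dominant costs are a few rank-sized matrix products instead of the O(mn·min(m, n)) full SVD, which is where the reported speedup over vanilla GaLore's projection estimation comes from.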
