ResearchTrend.AI

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
arXiv 2403.03507 (v2, latest) · 6 March 2024
Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zinan Lin, A. Anandkumar, Yuandong Tian

Papers citing "GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection"

Showing 50 of 219 citing papers.
Surgical AI Copilot: Energy-Based Fourier Gradient Low-Rank Adaptation for Surgical LLM Agent Reasoning and Planning
Jiayuan Huang, Runlong He, Danyal Z. Khan, E. Mazomenos, Danail Stoyanov, Hani J. Marcus, Matthew J. Clarkson, Mobarakol Islam, Mobarak I. Hoque
12 Mar 2025

WikiBigEdit: Understanding the Limits of Lifelong Knowledge Editing in LLMs
Lukas Thede, Karsten Roth, Matthias Bethge, Zeynep Akata, Tom Hartvigsen
07 Mar 2025

Efficient Jailbreaking of Large Models by Freeze Training: Lower Layers Exhibit Greater Sensitivity to Harmful Content
Hongyuan Shen, Min Zheng, Jincheng Wang, Yang Zhao
28 Feb 2025

LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM
Yehonathan Refael, Iftach Arbel, Ofir Lindenbaum, Tom Tirer
26 Feb 2025

The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang, Mingze Wang, Zhanpeng Zhou, Junchi Yan, Weinan E, Lei Wu
26 Feb 2025

Kanana: Compute-efficient Bilingual Language Models
Kanana LLM Team, Yunju Bak, Hojin Lee, Minho Ryu, Jiyeon Ham, ..., Daniel Lee, Minchul Lee, MinHyung Lee, Shinbok Lee, Gaeun Seo
26 Feb 2025

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Chenghao Fan, Zhenyi Lu, Sichen Liu, Xiaoye Qu, Wei Wei, Yu Cheng
24 Feb 2025

COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
Liming Liu, Zhenghao Xu, Zixuan Zhang, Hao Kang, Zichong Li, Chen Liang, Weizhu Chen, T. Zhao
24 Feb 2025

Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam
Tianjin Huang, Haotian Hu, Zhenyu Zhang, Gaojie Jin, Xianrui Li, ..., Tianlong Chen, Lu Liu, Qingsong Wen, Zhangyang Wang, Shiwei Liu
24 Feb 2025

Enhancing Adversarial Robustness of Vision-Language Models through Low-Rank Adaptation
International Conference on Multimedia Retrieval (ICMR), 2024
Yuheng Ji, Yue Liu, Zhicheng Zhang, Zhao Zhang, Yuting Zhao, Gang Zhou, Xingwei Zhang, Xinwang Liu, Xiaolong Zheng
21 Feb 2025

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
Cheng Luo, Zefan Cai, Hanshi Sun, Jinqi Xiao, Bo Yuan, Wen Xiao, Junjie Hu, Jiawei Zhao, Beidi Chen, Julius Berner
18 Feb 2025

GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang
18 Feb 2025

GoRA: Gradient-driven Adaptive Low Rank Adaptation
Haonan He, Peng Ye, Yuchen Ren, Yuan Yuan, Luyang Zhou, Shucun Ju, Lei Chen
13 Feb 2025

Gradient Multi-Normalization for Stateless and Scalable LLM Training
M. Scetbon, Chao Ma, Wenbo Gong, Edward Meeds
10 Feb 2025

The Curse of Depth in Large Language Models
Wenfang Sun, Xinyuan Song, Pengxiang Li, Lu Yin, Yefeng Zheng, Shiwei Liu
09 Feb 2025

CE-LoRA: Computation-Efficient LoRA Fine-Tuning for Language Models
Guanduo Chen, Yutong He, Yipeng Hu, Kun Yuan, Binhang Yuan
03 Feb 2025

SubTrack++: Gradient Subspace Tracking for Scalable LLM Training
Sahar Rajabi, Nayeema Nonta, Sirisha Rambhatla
03 Feb 2025

Memory-Efficient Fine-Tuning of Transformers via Token Selection
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Antoine Simoulin, Namyong Park, Xiaoyi Liu, Grey Yang
31 Jan 2025

Wavelet Meets Adam: Compressing Gradients for Memory-Efficient Training
Ziqing Wen, Ping Luo, Jun Wang, Xiaoge Deng, Jinping Zou, Kun Yuan, Tao Sun, Dongsheng Li
13 Jan 2025

MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
Wenxuan Zeng, Ye Dong, Jinjin Zhou, Jin Tan, Tao Wei, Runsheng Wang, Meng Li
12 Jan 2025

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
International Conference on Learning Representations (ICLR), 2025
Tianjin Huang, Ziquan Zhu, Gaojie Jin, Lu Liu, Zinan Lin, Shiwei Liu
12 Jan 2025

TensorGRaD: Tensor Gradient Robust Decomposition for Memory-Efficient Neural Operator Training
Robert Joseph George, David Pitt, Jiawei Zhao, Jean Kossaifi, Cheng Luo, Yuandong Tian, Julius Berner, Anima Anandkumar
04 Jan 2025

GaLore+: Boosting Low-Rank Adaptation for LLMs with Cross-Head Projection
Xutao Liao, Shaohui Li, Yuhui Xu, Zhi Li, Zichen Liu, You He
31 Dec 2024

AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
International Conference on Learning Representations (ICLR), 2024
Yehonathan Refael, Jonathan Svirsky, Boris Shustin, Wasim Huleihel, Ofir Lindenbaum
31 Dec 2024

Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao, Xiaoyu Li, Zhao Song
22 Dec 2024

Domain-adaptative Continual Learning for Low-resource Tasks: Evaluation on Nepali
Sharad Duwal, Suraj Prasai, Suresh Manandhar
18 Dec 2024

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
International Conference on Learning Representations (ICLR), 2024
Pengxiang Li, Lu Yin, Shiwei Liu
18 Dec 2024

Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
Kaustubh Ponkshe, Raghav Singhal, Eduard A. Gorbunov, Alexey Tumanov, Samuel Horváth, Praneeth Vepakomma
29 Nov 2024

COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection
Computer Vision and Pattern Recognition (CVPR), 2024
Jinqi Xiao, S. Sang, Tiancheng Zhi, Jing Liu, Qing Yan, Linjie Luo, Bo Yuan
26 Nov 2024

Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang, Lizhang Chen, B. Liu, Qiang Liu
25 Nov 2024

Reassessing Layer Pruning in LLMs: New Insights and Methods
Yao Lu, Hao Cheng, Yujie Fang, Zeyu Wang, Jiaheng Wei, Dongwei Xu, Qi Xuan, Xiaoniu Yang, Zhaowei Zhu
23 Nov 2024

On the Impact of Fine-Tuning on Chain-of-Thought Reasoning
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Elita Lobo, Chirag Agarwal, Himabindu Lakkaraju
22 Nov 2024

FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training
Philip Zmushko, Aleksandr Beznosikov, Martin Takáč, Samuel Horváth
12 Nov 2024

Scalable Efficient Training of Large Language Models with Low-dimensional Projected Attention
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Xingtai Lv, Ning Ding, Kaiyan Zhang, Ermo Hua, Ganqu Cui, Bowen Zhou
04 Nov 2024

KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation
Rambod Azimi, Rishav Rishav, M. Teichmann, Samira Ebrahimi Kahou
28 Oct 2024

NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao, Yanshuai Cao, Lili Mou
28 Oct 2024

Understanding Adam Requires Better Rotation Dependent Assumptions
Tianyue H. Zhang, Lucas Maes, Alexia Jolicoeur-Martineau, Damien Scieur, Simon Lacoste-Julien, Charles Guille-Escuret
25 Oct 2024

COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
International Conference on Learning Representations (ICLR), 2024
Haocheng Xi, Han Cai, Ligeng Zhu, Yaojie Lu, Kurt Keutzer, Jianfei Chen, Song Han
25 Oct 2024

GeoLoRA: Geometric integration for parameter efficient fine-tuning
International Conference on Learning Representations (ICLR), 2024
Steffen Schotthöfer, Emanuele Zangrando, Gianluca Ceruti, Francesco Tudisco, J. Kusch
24 Oct 2024

FairLoRA: Unpacking Bias Mitigation in Vision Models with Fairness-Driven Low-Rank Adaptation
Rohan Sukumaran, Aarash Feizi, Adriana Romero-Sorian, G. Farnadi
22 Oct 2024

Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning
Arijit Das
21 Oct 2024

LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
International Conference on Learning Representations (ICLR), 2024
Thomas Robert, M. Safaryan, Ionut-Vlad Modoranu, Dan Alistarh
21 Oct 2024

CompAct: Compressed Activations for Memory-Efficient LLM Training
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Yara Shamshoum, Nitzan Hodos, Yuval Sieradzki, Assaf Schuster
20 Oct 2024

Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models
Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding
13 Oct 2024

ALLoRA: Adaptive Learning Rate Mitigates LoRA Fatal Flaws
Hai Huang, Randall Balestriero
13 Oct 2024

MoIN: Mixture of Introvert Experts to Upcycle an LLM
Ajinkya Tejankar, K. Navaneet, Ujjawal Panchal, Kossar Pourahmadi, Hamed Pirsiavash
13 Oct 2024

Parameter-Efficient Fine-Tuning of Large Language Models using Semantic Knowledge Tuning
Scientific Reports (Sci Rep), 2024
Nusrat Jahan Prottasha, Asif Mahmud, Md. Shohanur Islam Sobuj, Prakash Bhat, Md. Kowsher, Niloofar Yousefi, O. Garibay
11 Oct 2024

Zeroth-Order Fine-Tuning of LLMs in Random Subspaces
Ziming Yu, Pan Zhou, Sike Wang, Jia Li, Hua Huang
11 Oct 2024

Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Adriana Fernandez-Lopez, Shiwei Liu, L. Yin, Stavros Petridis, Maja Pantic
10 Oct 2024

Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
International Conference on Learning Representations (ICLR), 2024
Zeman Li, Xinwei Zhang, Peilin Zhong, Yuan Deng, Meisam Razaviyayn, Vahab Mirrokni
09 Oct 2024

Page 3 of 5