ResearchTrend.AI

© 2026 ResearchTrend.AI, All rights reserved.

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection


6 March 2024
Jiawei Zhao
Zhenyu Zhang
Beidi Chen
Zinan Lin
A. Anandkumar
Yuandong Tian
ArXiv (abs) · PDF · HTML · HuggingFace (189 upvotes)
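The paper's title names its core idea: project the gradient onto a low-rank subspace so the optimizer state lives in far fewer dimensions. As a rough illustration only (not the paper's actual implementation — the function name, `update_every` default, and the use of plain SGD in place of Adam are all assumptions made for brevity), a minimal NumPy sketch of that projection step:

```python
import numpy as np

def galore_style_update(W, G, lr, r, step, P=None, update_every=200):
    """One illustrative weight update with gradient low-rank projection.

    Every `update_every` steps, refresh the projector P from the top-r
    left singular vectors of the current gradient; otherwise reuse it.
    The optimizer step is then taken in the r-dimensional subspace.
    """
    if P is None or step % update_every == 0:
        U, _, _ = np.linalg.svd(G, full_matrices=False)
        P = U[:, :r]                 # (m, r) orthonormal projection basis
    G_low = P.T @ G                  # project gradient into rank-r space
    # A real optimizer (e.g. Adam) would keep its moment estimates in
    # this small (r, n) space; plain SGD is used here for brevity.
    W = W - lr * (P @ G_low)         # project back and apply the step
    return W, P
```

The memory saving comes from the comment above: for an m-by-n weight matrix, optimizer state shrinks from m-by-n to r-by-n when r is much smaller than m.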

Papers citing "GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection"

50 / 219 papers shown
PEFT-Factory: Unified Parameter-Efficient Fine-Tuning of Autoregressive Large Language Models
Róbert Belanec
Ivan Srba
Maria Bielikova
ALM
440
0
0
02 Dec 2025
ZO-ASR: Zeroth-Order Fine-Tuning of Speech Foundation Models without Back-Propagation
Yuezhang Peng
Yu‐Xin Liu
Yao Li
S. Wang
Fei Wen
Xie Chen
111
0
0
01 Dec 2025
The Path Not Taken: RLVR Provably Learns Off the Principals
Hanqing Zhu
Zhenyu Zhang
Hanxian Huang
DiJia Su
Zechun Liu
...
Jinwon Lee
David Z. Pan
Zinan Lin
Yuandong Tian
Kai Sheng Tai
198
4
0
11 Nov 2025
EcoSpa: Efficient Transformer Training with Coupled Sparsity
Jinqi Xiao
Cheng Luo
Lingyi Huang
Cheng Yang
Yang Sui
...
Xiao Zang
Yibiao Ying
Zhexiang Tang
A. Anandkumar
Bo Yuan
81
1
0
09 Nov 2025
Subsampled Randomized Fourier GaLore for Adapting Foundation Models in Depth-Driven Liver Landmark Segmentation
Yun-Chen Lin
Jiayuan Huang
Hanyuan Zhang
Sergi Kavtaradze
Matthew J. Clarkson
Mobarak I. Hoque
3DGS, MedIm
102
0
0
05 Nov 2025
An All-Reduce Compatible Top-K Compressor for Communication-Efficient Distributed Learning
Chuyan Chen
Chenyang Ma
Zhangxin Li
Yutong He
Yanjie Dong
Kun Yuan
281
0
0
30 Oct 2025
IBNorm: Information-Bottleneck Inspired Normalization for Representation Learning
Xiandong Zou
Pan Zhou
103
0
0
29 Oct 2025
MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling
Yuxi Liu
Renjia Deng
Yutong He
Xue Wang
Tao Yao
Kun Yuan
148
0
0
28 Oct 2025
From Memorization to Reasoning in the Spectrum of Loss Curvature
Jack Merullo
Srihita Vatsavaya
Lucius Bushnaq
Owen Lewis
211
1
0
28 Oct 2025
Kernelized Sparse Fine-Tuning with Bi-level Parameter Competition for Vision Models
Shufan Shen
Junshu Sun
Shuhui Wang
Qingming Huang
136
0
0
28 Oct 2025
How do simple rotations affect the implicit bias of Adam?
Adela DePavia
Vasileios Charisopoulos
Rebecca Willett
ODL
364
0
0
27 Oct 2025
Improving the Straight-Through Estimator with Zeroth-Order Information
Ningfeng Yang
Tor M. Aamodt
FedML
288
0
0
27 Oct 2025
Towards Fast LLM Fine-tuning through Zeroth-Order Optimization with Projected Gradient-Aligned Perturbations
Zhendong Mi
Qitao Tan
Grace Li Zhang
Zhaozhuo Xu
Geng Yuan
Shaoyi Huang
145
0
0
21 Oct 2025
Unbiased Gradient Low-Rank Projection
Rui Pan
Yang Luo
Yuxing Liu
Yang You
Tong Zhang
149
0
0
20 Oct 2025
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Zhoutong Wu
Y. Zhang
Yiming Dong
Chenheng Zhang
Cong Fang
Kun Yuan
Zhouchen Lin
157
0
0
19 Oct 2025
A Guardrail for Safety Preservation: When Safety-Sensitive Subspace Meets Harmful-Resistant Null-Space
Bingjie Zhang
Yibo Yang
Renzhe
Dandan Guo
Jindong Gu
Philip Torr
Bernard Ghanem
289
1
0
16 Oct 2025
Noise-Adaptive Layerwise Learning Rates: Accelerating Geometry-Aware Optimization for Deep Neural Network Training
Jie Hao
Xiaochuan Gong
Jie Xu
Z. Wang
Mingrui Liu
AI4CE
152
0
0
15 Oct 2025
AdaPM: a Partial Momentum Algorithm for LLM Training
Yimu Zhang
Yuanshi Liu
Cong Fang
146
0
0
10 Oct 2025
ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement
Habin Lim
Yeongseob Won
Juwon Seo
Gyeong-Moon Park
165
0
0
06 Oct 2025
QDeepGR4J: Quantile-based ensemble of deep learning and GR4J hybrid rainfall-runoff models for extreme flow prediction with uncertainty quantification
Arpit Kapoor
Rohitash Chandra
123
3
0
06 Oct 2025
REG: A Regularization Optimizer for Robust Training Dynamics
Zehua Liu
Han Wu
Xiaojin Fu
Shuqi Liu
Xiongwei Han
Tao Zhong
Mingxuan Yuan
111
0
0
04 Oct 2025
Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation
Yongfu Xue
AI4CE
143
0
0
04 Oct 2025
Memory-Efficient Backpropagation for Fine-Tuning LLMs on Resource-Constrained Mobile Devices
Congzheng Song
Xinyu Tang
123
0
0
03 Oct 2025
Randomized Gradient Subspaces for Efficient Large Language Model Training
Sahar Rajabi
Nayeema Nonta
Samanvay Vajpayee
Sirisha Rambhatla
111
0
0
02 Oct 2025
Sample-Efficient Differentially Private Fine-Tuning via Gradient Matrix Denoising
Ali Dadsetan
Frank Rudzicz
95
0
0
01 Oct 2025
Finetune Once: Decoupling General & Domain Learning with Dynamic Boosted Annealing
Yang Tang
Ruijie Liu
Yifan Wang
Shiyu Li
Xi Chen
114
0
0
30 Sep 2025
PrunedLoRA: Robust Gradient-Based structured pruning for Low-rank Adaptation in Fine-tuning
Xin Yu
Cong Xie
Ziyu Zhao
Tiantian Fan
Lingzhou Xue
Zhi-Li Zhang
246
0
0
30 Sep 2025
Conda: Column-Normalized Adam for Training Large Language Models Faster
Junjie Wang
Pan Zhou
Yiming Dong
Huan Li
Jia Li
Xun Zhou
Qicheng Lao
Cong Fang
Zhouchen Lin
AI4CE
246
0
0
29 Sep 2025
Effective Quantization of Muon Optimizer States
Aman Gupta
Rafael Celente
Abhishek Shivanna
D. T. Braithwaite
Gregory Dexter
Shao Tang
Hiroto Udagawa
Daniel Silva
R. Ramanath
S. Keerthi
MQ
142
0
0
27 Sep 2025
Memory-Efficient Fine-Tuning via Low-Rank Activation Compression
Jiang-Xin Shi
Wen-Da Wei
Jin-Fei Qi
Xuanyu Chen
Tong Wei
Yu-Feng Li
131
0
0
27 Sep 2025
Partial Parameter Updates for Efficient Distributed Training
Anastasiia Filippova
Angelos Katharopoulos
David Grangier
Ronan Collobert
FedML
136
0
0
26 Sep 2025
Provable Scaling Laws of Feature Emergence from Learning Dynamics of Grokking
Yuandong Tian
204
0
0
25 Sep 2025
No Prior, No Leakage: Revisiting Reconstruction Attacks in Trained Neural Networks
Yehonatan Refael
Guy Smorodinsky
Ofir Lindenbaum
Itay Safran
MIA, CV, AAML
305
0
0
25 Sep 2025
Faster Than SVD, Smarter Than SGD: The OPLoRA Alternating Update
Abdulla Jasem Almansoori
Maria Ivanova
Andrey Veprikov
Aleksandr Beznosikov
Samuel Horvath
Martin Takáč
109
0
0
24 Sep 2025
CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure
Boao Kong
Junzhu Liang
Yuxi Liu
Renjia Deng
Kun Yuan
160
1
0
23 Sep 2025
Development of Deep Learning Optimizers: Approaches, Concepts, and Update Rules
Doğay Altınel
135
0
0
22 Sep 2025
BEFT: Bias-Efficient Fine-Tuning of Language Models
Baichuan Huang
Ananth Balashankar
Amir Aminifar
127
0
0
19 Sep 2025
Distribution-Aligned Decoding for Efficient LLM Task Adaptation
Senkang Hu
Xudong Han
Jinqi Jiang
Yihang Tao
Zihan Fang
Yong Dai
Sam Kwong
Yuguang Fang
239
2
0
19 Sep 2025
Low-rank surrogate modeling and stochastic zero-order optimization for training of neural networks with black-box layers
Andrei Chertkov
Artem Basharin
Mikhail Saygin
Evgeny Frolov
Stanislav Straupe
Ivan Oseledets
142
0
0
18 Sep 2025
Low-rank Orthogonalization for Large-scale Matrix Optimization with Applications to Foundation Model Training
Chuan He
Zhanwang Deng
Zhaosong Lu
BDL
168
2
0
15 Sep 2025
From PowerSGD to PowerSGD+: Low-Rank Gradient Compression for Distributed Optimization with Convergence Guarantees
Shengping Xie
Chuyan Chen
Kun Yuan
117
0
0
14 Sep 2025
An Efficient Subspace Algorithm for Federated Learning on Heterogeneous Data
Jiaojiao Zhang
Yuqi Xu
Kun Yuan
FedML
120
0
0
05 Sep 2025
ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning
Yiming Du
Yifan Xiang
Bin Liang
Dahua Lin
Kam-Fai Wong
Fei Tan
OffRL
178
1
0
27 Aug 2025
DropLoRA: Sparse Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
Haojie Zhang
92
2
0
24 Aug 2025
Empowering Multimodal LLMs with External Tools: A Comprehensive Survey
Wenbin An
Jiahao Nie
Yaqiang Wu
Feng Tian
Shijian Lu
Q. Zheng
MLLM
182
1
0
14 Aug 2025
LLMC+: Benchmarking Vision-Language Model Compression with a Plug-and-play Toolkit
Chengtao Lv
Bilang Zhang
Yang Yong
Yazhe Niu
Yushi Huang
Shiqiao Gu
Jiajun Wu
Yumeng Shi
Jinyang Guo
Wenya Wang
MLLM, VLM
162
0
0
13 Aug 2025
LOST: Low-rank and Sparse Pre-training for Large Language Models
Jiaxi Li
Lu Yin
Li Shen
Jinjin Xu
Liwu Xu
Tianjin Huang
Wenwu Wang
Shiwei Liu
Xilu Wang
152
2
0
04 Aug 2025
Efficiently Seeking Flat Minima for Better Generalization in Fine-Tuning Large Language Models and Beyond
Jiaxin Deng
Qingcheng Zhu
Junbiao Pang
Linlin Yang
Zhongqian Fu
Baochang Zhang
150
0
0
01 Aug 2025
From LLMs to Edge: Parameter-Efficient Fine-Tuning on Edge Devices
Georg Slamanig
Francesco Corti
O. Saukh
113
0
0
31 Jul 2025
TorchAO: PyTorch-Native Training-to-Serving Model Optimization
Andrew Or
Apurva Jain
Daniel Vega-Myhre
Jesse Cai
Charles David Hernandez
...
Christian Puhrsch
Mark Saroufim
Supriya Rao
Thien Tran
Aleksandar Samardžić
MQ
171
4
0
21 Jul 2025