AdaLomo: Low-memory Optimization with Adaptive Learning Rate

16 October 2023
Kai Lv
Hang Yan
Qipeng Guo
Haijun Lv
Xipeng Qiu
    ODL

Papers citing "AdaLomo: Low-memory Optimization with Adaptive Learning Rate"

18 / 18 papers shown
LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM
Yehonathan Refael
Iftach Arbel
Ofir Lindenbaum
Tom Tirer
64
0
0
26 Feb 2025
Slamming: Training a Speech Language Model on One GPU in a Day
Gallil Maimon
Avishai Elmakies
Yossi Adi
38
3
0
19 Feb 2025
Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition
Robert Joseph George
David Pitt
Jiawei Zhao
Jean Kossaifi
Cheng Luo
Yuandong Tian
Anima Anandkumar
26
1
0
04 Jan 2025
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
Yehonathan Refael
Jonathan Svirsky
Boris Shustin
Wasim Huleihel
Ofir Lindenbaum
32
3
0
31 Dec 2024
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
Thomas Robert
M. Safaryan
Ionut-Vlad Modoranu
Dan Alistarh
ODL
31
2
0
21 Oct 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
57
23
0
17 Sep 2024
Achieving Peak Performance for Large Language Models: A Systematic Review
Z. R. K. Rostam
Sándor Szénási
Gábor Kertész
26
3
0
07 Sep 2024
MINI-SEQUENCE TRANSFORMER: Optimizing Intermediate Memory for Long Sequences Training
Cheng Luo
Jiawei Zhao
Zhuoming Chen
Beidi Chen
A. Anandkumar
16
3
0
22 Jul 2024
Weak-to-Strong Reasoning
Yuqing Yang
Yan Ma
Pengfei Liu
LRM
25
13
0
18 Jul 2024
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Zhenyu (Allen) Zhang
Ajay Jaiswal
L. Yin
Shiwei Liu
Jiawei Zhao
Yuandong Tian
Zhangyang Wang
VLM
23
1
0
11 Jul 2024
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients
Aashiq Muhamed
Oscar Li
David Woodruff
Mona Diab
Virginia Smith
37
7
0
25 Jun 2024
Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang
Congliang Chen
Ziniu Li
Tian Ding
Chenwei Wu
Yinyu Ye
Zhi-Quan Luo
Ruoyu Sun
31
33
0
24 Jun 2024
Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models
Lukas Thede
Karsten Roth
Olivier J. Hénaff
Matthias Bethge
Zeynep Akata
CLL
29
5
0
13 Jun 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Jiawei Zhao
Zhenyu (Allen) Zhang
Beidi Chen
Zhangyang Wang
A. Anandkumar
Yuandong Tian
27
173
0
06 Mar 2024
CoLLiE: Collaborative Training of Large Language Models in an Efficient Way
Kai Lv
Shuo Zhang
Tianle Gu
Shuhao Xing
Jiawei Hong
...
Tengxiao Liu
Yu Sun
Qipeng Guo
Hang Yan
Xipeng Qiu
35
6
0
01 Dec 2023
Instruction Tuning with GPT-4
Baolin Peng
Chunyuan Li
Pengcheng He
Michel Galley
Jianfeng Gao
SyDa
ALM
LM&MA
157
576
0
06 Apr 2023
BBTv2: Towards a Gradient-Free Future with Large Language Models
Tianxiang Sun
Zhengfu He
Hong Qian
Yunhua Zhou
Xuanjing Huang
Xipeng Qiu
100
53
0
23 May 2022
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021