Fine-Tuning Language Models with Just Forward Passes

Neural Information Processing Systems (NeurIPS), 2023
27 May 2023
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora

Papers citing "Fine-Tuning Language Models with Just Forward Passes"

38 / 188 papers shown
Rethinking Machine Unlearning for Large Language Models
Sijia Liu
Yuanshun Yao
Jinghan Jia
Stephen Casper
Nathalie Baracaldo
...
Hang Li
Kush R. Varshney
Mohit Bansal
Sanmi Koyejo
Yang Liu
AILaw
MU
428
200
0
13 Feb 2024
Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning
Zhicheng Liu
Jian Lou
Wenxuan Bao
Yihan Hu
Baochun Li
Zhan Qin
K. Ren
422
13
0
12 Feb 2024
On the Convergence of Zeroth-Order Federated Tuning for Large Language Models
Zhenqing Ling
Daoyuan Chen
Liuyi Yao
Yaliang Li
Ying Shen
FedML
316
24
0
08 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery
Steven Kolawole
Jean-Francois Kagey
Virginia Smith
Graham Neubig
Ameet Talwalkar
279
46
0
08 Feb 2024
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
Josh Alman
Zhao Song
217
23
0
07 Feb 2024
Flora: Low-Rank Adapters Are Secretly Gradient Compressors
International Conference on Machine Learning (ICML), 2024
Yongchang Hao
Yanshuai Cao
Lili Mou
291
86
0
05 Feb 2024
Stochastic Two Points Method for Deep Model Zeroth-order Optimization
Yijiang Pang
Jiayu Zhou
428
1
0
02 Feb 2024
HiFT: A Hierarchical Full Parameter Fine-Tuning Strategy
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yongkang Liu
Yiqun Zhang
Qian Li
Tong Liu
Shi Feng
Daling Wang
Yifei Zhang
Hinrich Schütze
296
14
0
26 Jan 2024
Private Fine-tuning of Large Language Models with Zeroth-order Optimization
Xinyu Tang
Ashwinee Panda
Milad Nasr
Saeed Mahloujifar
Prateek Mittal
566
40
0
09 Jan 2024
Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning
Wenhan Xia
Chengwei Qin
Elad Hazan
248
84
0
08 Jan 2024
IoT in the Era of Generative AI: Vision and Challenges
IEEE Internet Computing (IEEE Internet Comput.), 2024
Xin Wang
Zhongwei Wan
Arvin Hekmati
M. Zong
Samiul Alam
Mi Zhang
Bhaskar Krishnamachari
265
5
0
03 Jan 2024
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-order Optimization
Shuoran Jiang
Qingcai Chen
Youcheng Pan
Yang Xiang
Yukang Lin
Xiangping Wu
Chuanyi Liu
Xiaobao Song
ODL
191
20
0
23 Dec 2023
Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models
Alan Chan
Ben Bucknall
Herbie Bradley
David M. Krueger
248
7
0
22 Dec 2023
Training Convolutional Neural Networks with the Forward-Forward algorithm
Scientific Reports (Sci Rep), 2023
Riccardo Scodellaro
A. Kulkarni
Frauke Alves
Matthias Schröter
375
13
0
22 Dec 2023
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes
Zhen Qin
Daoyuan Chen
Bingchen Qian
Bolin Ding
Yaliang Li
Shuiguang Deng
FedML
366
58
0
11 Dec 2023
Batched Low-Rank Adaptation of Foundation Models
International Conference on Learning Representations (ICLR), 2023
Yeming Wen
Swarat Chaudhuri
OffRL
311
28
0
09 Dec 2023
f-FERM: A Scalable Framework for Robust Fair Empirical Risk Minimization
International Conference on Learning Representations (ICLR), 2023
Sina Baharlouei
Shivam Patel
Meisam Razaviyayn
403
4
0
06 Dec 2023
PrivateLoRA For Efficient Privacy Preserving LLM
Yiming Wang
Yu Lin
Xiaodong Zeng
Guannan Zhang
275
25
0
23 Nov 2023
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
Yiming Wang
Yu Lin
Xiaodong Zeng
Guannan Zhang
MoMe
265
32
0
20 Nov 2023
The Expressibility of Polynomial based Attention Scheme
Zhao Song
Guangyi Xu
Junze Yin
313
7
0
30 Oct 2023
Learning to (Learn at Test Time)
Yu Sun
Xinhao Li
Karan Dalal
Chloe Hsu
Oluwasanmi Koyejo
Carlos Guestrin
Xiaolong Wang
Tatsunori Hashimoto
Xinlei Chen
SSL
322
11
0
20 Oct 2023
AdaLomo: Low-memory Optimization with Adaptive Learning Rate
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Kai Lv
Hang Yan
Qipeng Guo
Haijun Lv
Xipeng Qiu
ODL
307
29
0
16 Oct 2023
DPZero: Private Fine-Tuning of Language Models without Backpropagation
Liang Zhang
Bingcong Li
K. K. Thekumparampil
Sewoong Oh
Niao He
446
22
0
14 Oct 2023
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
International Conference on Learning Representations (ICLR), 2023
Yi-Lin Sung
Jaehong Yoon
Mohit Bansal
VLM
273
20
0
04 Oct 2023
DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training
International Conference on Learning Representations (ICLR), 2023
Chenyi Zi
Yimeng Zhang
Jinghan Jia
James Diffenderfer
Jiancheng Liu
Konstantinos Parasyris
Yihua Zhang
Zheng Zhang
B. Kailkhura
Sijia Liu
637
78
0
03 Oct 2023
Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation
International Conference on Learning Representations (ICLR), 2023
Kai Huang
Hanyu Yin
Heng Huang
Wei Gao
255
17
0
22 Sep 2023
A Fast Optimization View: Reformulating Single Layer Attention in LLM Based on Tensor and SVM Trick, and Solving It in Matrix Multiplication Time
Yeqi Gao
Zhao Song
Weixin Wang
Junze Yin
297
30
0
14 Sep 2023
FwdLLM: Efficient FedLLM using Forward Gradient
Mengwei Xu
Dongqi Cai
Yaozong Wu
Xiang Li
Shangguang Wang
FedML
251
34
0
26 Aug 2023
How to Protect Copyright Data in Optimization of Large Language Models?
AAAI Conference on Artificial Intelligence (AAAI), 2023
T. Chu
Zhao Song
Chiwun Yang
216
39
0
23 Aug 2023
Tensor-Compressed Back-Propagation-Free Training for (Physics-Informed) Neural Networks
Yequan Zhao
Xinling Yu
Zhixiong Chen
Ziyue Liu
Sijia Liu
Zheng Zhang
PINN
192
14
0
18 Aug 2023
Convergence of Two-Layer Regression with Nonlinear Units
Yichuan Deng
Zhao Song
Shenghao Xie
200
8
0
16 Aug 2023
Zero-th Order Algorithm for Softmax Attention Optimization
BigData Congress [Services Society] (BSS), 2023
Yichuan Deng
Zhihang Li
Sridhar Mahadevan
Zhao Song
202
18
0
17 Jul 2023
An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization
Journal of machine learning research (JMLR), 2023
Guy Kornowski
Ohad Shamir
332
26
0
10 Jul 2023
ChatGPT in the Age of Generative AI and Large Language Models: A Concise Survey
S. Mohamadi
Ghulam Mujtaba
Ngan Le
Gianfranco Doretto
Don Adjeroh
LM&MA
AI4MH
302
37
0
09 Jul 2023
Trainable Transformer in Transformer
International Conference on Machine Learning (ICML), 2023
A. Panigrahi
Sadhika Malladi
Mengzhou Xia
Sanjeev Arora
VLM
353
14
0
03 Jul 2023
Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness
E. Zelikman
Qian Huang
Abigail Z. Jacobs
Nick Haber
Noah D. Goodman
181
20
0
16 Jun 2023
Full Parameter Fine-tuning for Large Language Models with Limited Resources
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Kai Lv
Yuqing Yang
Tengxiao Liu
Qi-jie Gao
Qipeng Guo
Xipeng Qiu
330
186
0
16 Jun 2023
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
International Conference on Machine Learning (ICML), 2022
Ashwinee Panda
Xinyu Tang
Saeed Mahloujifar
Vikash Sehwag
Prateek Mittal
341
15
0
08 Dec 2022