Fine-Tuning Language Models with Just Forward Passes

Neural Information Processing Systems (NeurIPS), 2023
27 May 2023
Sadhika Malladi
Tianyu Gao
Eshaan Nichani
Alexandru Damian
Jason D. Lee
Danqi Chen
Sanjeev Arora

Papers citing "Fine-Tuning Language Models with Just Forward Passes"

50 / 188 papers shown
Towards Efficient Large Language Models for Scientific Text: A Review
H. To
Ming Liu
Guangyan Huang
181
3
0
20 Aug 2024
Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches
Yanjie Dong
Xiaoyi Fan
Fangxin Wang
Chengming Li
Victor C. M. Leung
Xiping Hu
250
11
0
20 Aug 2024
Parameter-Efficient Fine-Tuning via Circular Convolution
Chenyi Zi
Jiashun Cheng
Zijing Liu
Ziqi Gao
Fugee Tsung
Yu-Feng Li
Jia Li
502
4
0
27 Jul 2024
Improving GPU Multi-Tenancy Through Dynamic Multi-Instance GPU Reconfiguration
Tianyu Wang
Sheng Li
Bingyao Li
Yuezhen Dai
Ao Li
Geng Yuan
Yufei Ding
Youtao Zhang
Xulong Tang
231
9
0
18 Jul 2024
Online Pseudo-Zeroth-Order Training of Neuromorphic Spiking Neural Networks
Mingqing Xiao
Qingyan Meng
Zongpeng Zhang
D.K. He
Zhouchen Lin
271
1
0
17 Jul 2024
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models
Hongrong Cheng
Miao Zhang
J. Q. Shi
297
6
0
16 Jul 2024
LoRA-PT: Low-Rank Adapting UNETR for Hippocampus Segmentation Using Principal Tensor Singular Values and Vectors
Guanghua He
Wangang Cheng
Hancan Zhu
Gaohang Yu
351
3
0
16 Jul 2024
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
Guanqiao Qu
Qiyuan Chen
Wei Wei
Zheng Lin
Xianhao Chen
Kaibin Huang
544
155
0
09 Jul 2024
Expressive and Generalizable Low-rank Adaptation for Large Models via Slow Cascaded Learning
Siwei Li
Yifan Yang
Yifei Shen
Fangyun Wei
Zongqing Lu
L. Qiu
Yuqing Yang
AI4CE
205
5
0
01 Jul 2024
PocketLLM: Enabling On-Device Fine-Tuning for Personalized LLMs
Dan Peng
Zhihui Fu
Jun Wang
199
21
0
01 Jul 2024
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Enshu Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Matthew B. Blaschko
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MoE
229
21
0
01 Jul 2024
AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning
Yifan Yang
Kai Zhen
Ershad Banijamal
Athanasios Mouchtaris
Zheng Zhang
139
20
0
26 Jun 2024
Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang
Congliang Chen
Ziniu Li
Tian Ding
Chenwei Wu
Yinyu Ye
Zhi-Quan Luo
Tian Ding
446
84
0
24 Jun 2024
Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization
Sungbin Shin
Wonpyo Park
Jaeho Lee
Namhoon Lee
176
6
0
21 Jun 2024
Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization
Afonso de Sá Delgado Neto
Maximilian Egger
Mayank Bakshi
Rawad Bitar
FedML
AI4CE
134
3
0
20 Jun 2024
Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization
Qianli Shen
Yezhen Wang
Zhouhao Yang
Xiang Li
Haonan Wang
Yang Zhang
Jonathan Scarlett
Zhanxing Zhu
Kenji Kawaguchi
AI4CE
241
10
0
20 Jun 2024
Synergizing Foundation Models and Federated Learning: A Survey
Shenghui Li
Fanghua Ye
Meng Fang
Jiaxu Zhao
Yun-Hin Chan
Edith C.-H. Ngai
Thiemo Voigt
AI4CE
265
9
0
18 Jun 2024
DIEKAE: Difference Injection for Efficient Knowledge Augmentation and Editing of Large Language Models
Alessio Galatolo
Meriem Beloucif
Katie Winkle
163
0
0
15 Jun 2024
Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach
Challapalli Phanindra Revanth
Sumohana S. Channappayya
C Krishna Mohan
205
23
0
11 Jun 2024
Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity
Wentao Guo
Jikai Long
Yimeng Zeng
Zirui Liu
Xinyu Yang
...
Osbert Bastani
Christopher De Sa
Xiaodong Yu
Beidi Chen
Zhaozhuo Xu
267
32
0
05 Jun 2024
Why Larger Language Models Do In-context Learning Differently?
Zhenmei Shi
Junyi Wei
Zhuoyan Xu
Yingyu Liang
268
45
0
30 May 2024
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
Hao Di
Haishan Ye
Yueling Zhang
Xiangyu Chang
Guang Dai
Ivor W. Tsang
348
2
0
28 May 2024
Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective
Akiyoshi Tomihari
Issei Sato
281
10
0
27 May 2024
Thinking Forward: Memory-Efficient Federated Finetuning of Language Models
Kunjal Panchal
Nisarg Parikh
Sunav Choudhary
Lijun Zhang
Yuriy Brun
Hui Guan
225
7
0
24 May 2024
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li
Bicheng Ying
Zidong Liu
Chaosheng Dong
Haibo Yang
FedML
526
11
0
24 May 2024
Efficient Multimodal Large Language Models: A Survey
Yizhang Jin
Jian Li
Yexin Liu
Tianjun Gu
Kai Wu
...
Xin Tan
Zhenye Gan
Yabiao Wang
Chengjie Wang
Lizhuang Ma
LRM
307
86
0
17 May 2024
Binary Hypothesis Testing for Softmax Models and Leverage Score Models
Yeqi Gao
Yuzhou Gu
Zhao Song
412
1
0
09 May 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
International Conference on Machine Learning (ICML), 2024
Jing Xu
Jingzhao Zhang
227
11
0
04 May 2024
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Qi Luo
Hengxu Yu
Xiao Li
264
16
0
03 Apr 2024
Test-Time Model Adaptation with Only Forward Passes
International Conference on Machine Learning (ICML), 2024
Shuaicheng Niu
Chunyan Miao
Guohao Chen
Pengcheng Wu
Peilin Zhao
TTA
407
57
0
02 Apr 2024
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
International Conference on Learning Representations (ICLR), 2024
En-hao Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Shuaiqi Wang
...
Sergey Yekhanin
Guohao Dai
Huazhong Yang
Yu Wang
MoMe
430
5
0
02 Apr 2024
Heterogeneous Contrastive Learning for Foundation Models and Beyond
Lecheng Zheng
Baoyu Jing
Zihao Li
Hanghang Tong
Jingrui He
VLM
237
36
0
30 Mar 2024
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
Boyao Wang
Xiang Liu
Shizhe Diao
Renjie Pi
Jipeng Zhang
Chi Han
Tong Zhang
383
93
0
26 Mar 2024
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han
Chao Gao
Jinyang Liu
Jeff Zhang
Sai Qian Zhang
800
707
0
21 Mar 2024
Debiased Noise Editing on Foundation Models for Fair Medical Image Classification
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024
Ruinan Jin
Wenlong Deng
Minghui Chen
Xiaoxiao Li
MedIm
393
5
0
10 Mar 2024
Privacy-preserving Fine-tuning of Large Language Models through Flatness
Tiejin Chen
Longchao Da
Huixue Zhou
Pingzhi Li
Kaixiong Zhou
Tianlong Chen
Hua Wei
234
6
0
07 Mar 2024
Differentially Private Synthetic Data via Foundation Model APIs 2: Text
Chulin Xie
Zinan Lin
A. Backurs
Sivakanth Gopi
Da Yu
...
Haotian Jiang
Huishuai Zhang
Yin Tat Lee
Yue Liu
Sergey Yekhanin
SyDa
249
59
0
04 Mar 2024
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Xiang Meng
Shibal Ibrahim
Kayhan Behdin
Hussein Hazimeh
Natalia Ponomareva
Rahul Mazumder
VLM
339
12
0
02 Mar 2024
A Survey of Large Language Models in Cybersecurity
Gabriel de Jesus Coelho da Silva
Carlos Becker Westphall
257
14
0
26 Feb 2024
Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang
Congliang Chen
Tian Ding
Ziniu Li
Tian Ding
Zhimin Luo
370
76
0
26 Feb 2024
Personalized Federated Instruction Tuning via Neural Architecture Search
Peng Zhang
Yingbo Zhou
Ming Hu
Junxian Feng
Jiawen Weng
Xiao He
FedML
197
5
0
26 Feb 2024
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Xuantong Liu
Tianyang Hu
Wei Cao
Kenji Kawaguchi
Xingtai Lv
DiffM
207
3
0
26 Feb 2024
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
Yong Liu
Zirui Zhu
Chaoyu Gong
Minhao Cheng
Cho-Jui Hsieh
Yang You
MoE
244
36
0
24 Feb 2024
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao
Sizhe Dang
Haishan Ye
Guang Dai
Yi Qian
Ivor W. Tsang
672
29
0
23 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Wanrong Zhu
KELM
VLM
464
235
0
20 Feb 2024
GNNavi: Navigating the Information Flow in Large Language Models by Graph Neural Network
Shuzhou Yuan
Ercong Nie
Michael Farber
Helmut Schmid
Hinrich Schütze
223
5
0
18 Feb 2024
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
Yihua Zhang
Pingzhi Li
Junyuan Hong
Jiaxiang Li
Yimeng Zhang
...
Wotao Yin
Mingyi Hong
Zinan Lin
Sijia Liu
Tianlong Chen
413
100
0
18 Feb 2024
LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models
Yifan Yang
Jiajun Zhou
Ngai Wong
Zheng Zhang
195
16
0
18 Feb 2024
Improved Regret for Bandit Convex Optimization with Delayed Feedback
Yuanyu Wan
Chang Yao
Weilong Dai
Lijun Zhang
299
8
0
14 Feb 2024
The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
Myeongseob Ko
Feiyang Kang
Weiyan Shi
Ming Jin
Zhou Yu
Ruoxi Jia
TDI
267
14
0
14 Feb 2024