Fine-Tuning Language Models with Just Forward Passes

Neural Information Processing Systems (NeurIPS), 2023

27 May 2023

Papers citing "Fine-Tuning Language Models with Just Forward Passes"

50 / 188 papers shown

Stochastic Subspace Descent Accelerated via Bi-fidelity Line Search

Nuojin Cheng

Alireza Doostan

Stephen Becker

359

30 Apr 2025

POPri: Private Federated Learning using Preference-Optimized Synthetic Data

529

23 Apr 2025

Efficient Model Editing with Task-Localized Sparse Fine-tuning

International Conference on Learning Representations (ICLR), 2025

350

03 Apr 2025

A stochastic gradient descent algorithm with random search directions

Eméric Gbaguidi

ODL

246

25 Mar 2025

Efficient Personalization of Quantized Diffusion Model without Backpropagation

Computer Vision and Pattern Recognition (CVPR), 2025

366

19 Mar 2025

FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization

Hao Mark Chen

S. Hu

Wayne Luk

Timothy M. Hospedales

Hongxiang Fan

MoMe

457

16 Mar 2025

ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory

360

16 Mar 2025

A Closer Look at Adversarial Suffix Learning for Jailbreaking LLMs: Augmented Adversarial Trigger Learning

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Zhe Wang

Yanjun Qi

411

16 Mar 2025

A Survey on Federated Fine-tuning of Large Language Models

511

15 Mar 2025

Visualising Policy-Reward Interplay to Inform Zeroth-Order Preference Optimisation of Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

267

05 Mar 2025

Towards hyperparameter-free optimization with differential privacy

International Conference on Learning Representations (ICLR), 2025

Zhiqi Bu

Ruixuan Liu

254

02 Mar 2025

SubZero: Composing Subject, Style, and Action via Zero-Shot Personalization

Shubhankar Borse

K. Bhardwaj

Mohammad Reza Karimi Dastjerdi

...

440

27 Feb 2025

LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM

426

26 Feb 2025

Scalable Back-Propagation-Free Training of Optical Physics-Informed Neural Networks

360

17 Feb 2025

A Survey of Personalized Large Language Models: Progress and Future Directions

337

17 Feb 2025

An Efficient Sparse Fine-Tuning with Low Quantization Error via Neural Network Pruning

Cen-Jhih Li

Aditya Bhaskara

413

17 Feb 2025

QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models

Athanasios Mouchtaris

Ngai Wong

Zheng Zhang

308

17 Feb 2025

Efficient Zero-Order Federated Finetuning of Language Models for Resource-Constrained Devices

Mohamed Aboelenien Ahmed

527

14 Feb 2025

Model Diffusion for Certifiable Few-shot Transfer Learning

Fady Rezk

Royson Lee

Henry Gouk

Timothy M. Hospedales

Minyoung Kim

433

10 Feb 2025

Bilevel ZOFO: Efficient LLM Fine-Tuning and Meta-Training

575

05 Feb 2025

Memory-Efficient Fine-Tuning of Transformers via Token Selection

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

427

31 Jan 2025

Decentralized Low-Rank Fine-Tuning of Large Language Models

638

26 Jan 2025

An Enhanced Zeroth-Order Stochastic Frank-Wolfe Framework for Constrained Finite-Sum Optimization

435

13 Jan 2025

Stochastic Taylor Derivative Estimator: Efficient amortization for arbitrary differential operators

Neural Information Processing Systems (NeurIPS), 2024

1.0K

27 Nov 2024

Poor Man's Training on MCUs: A Memory-Efficient Quantized Back-Propagation-Free Approach

384

07 Nov 2024

Stepping Forward on the Last Mile

Neural Information Processing Systems (NeurIPS), 2024

426

06 Nov 2024

Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers

Neural Information Processing Systems (NeurIPS), 2024

357

01 Nov 2024

On the Crucial Role of Initialization for Matrix Factorization

International Conference on Learning Representations (ICLR), 2024

414

24 Oct 2024

CKSP: Cross-species Knowledge Sharing and Preserving for Universal Animal Activity Recognition

Biosystems Engineering (Biosyst. Eng.), 2024

Meilu Zhu

144

22 Oct 2024

Understanding Forgetting in LLM Supervised Fine-Tuning and Preference Learning - A Convex Optimization Perspective

460

20 Oct 2024

Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems

Neural Information Processing Systems (NeurIPS), 2024

Bingcong Li

Liang Zhang

Niao He

284

18 Oct 2024

A Theoretical Survey on Foundation Models

Shi Fu

Yuzhu Chen

Yingjie Wang

Dacheng Tao

304

15 Oct 2024

Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning

412

14 Oct 2024

Federated Data-Efficient Instruction Tuning for Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

371

14 Oct 2024

Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models

264

13 Oct 2024

Zeroth-Order Fine-Tuning of LLMs in Random Subspaces

361

11 Oct 2024

Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models

International Conference on Learning Representations (ICLR), 2024

Vahab Mirrokni

286

09 Oct 2024

Chemistry-Inspired Diffusion with Non-Differentiable Guidance

International Conference on Learning Representations (ICLR), 2024

Barnabás Póczos

352

09 Oct 2024

FLOPS: Forward Learning with OPtimal Sampling

International Conference on Learning Representations (ICLR), 2024

Yijie Peng

442

08 Oct 2024

LoRTA: Low Rank Tensor Adaptation of Large Language Models

Ignacio Hounie

Charilaos I. Kanatsoulis

Arnuv Tandon

Alejandro Ribeiro

487

05 Oct 2024

Variance-Reduced Gradient Estimator for Nonconvex Zeroth-Order Distributed Optimization

American Control Conference (ACC), 2024

Huaiyi Mu

Yujie Tang

Zhongkui Li

104

29 Sep 2024

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference

International Conference on Learning Representations (ICLR), 2024

Qining Zhang

Lei Ying

OffRL

475

25 Sep 2024

Communication and Energy Efficient Federated Learning using Zero-Order Optimization Technique

IEEE Transactions on Signal Processing (IEEE TSP), 2024

Elissa Mhanna

Mohamad Assaad

202

24 Sep 2024

MobiZO: Enabling Efficient LLM Fine-Tuning at the Edge via Inference Engines

278

23 Sep 2024

A Unified Causal Framework for Auditing Recommender Systems for Ethical Concerns

Zachary C. Lipton

167

20 Sep 2024

Self-Contrastive Forward-Forward Algorithm

Nature Communications (Nat. Commun.), 2024

587

17 Sep 2024

Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models

Yao Shu

Wenyang Hu

Szu Hui Ng

Bryan Kian Hsiang Low

Fei Richard Yu

FedML

455

10 Sep 2024

Scalable Multitask Learning Using Gradient-based Estimation of Task Affinity

Knowledge Discovery and Data Mining (KDD), 2024

Dongyue Li

Aneesh Sharma

Hongyang R. Zhang

334

09 Sep 2024

Towards training digitally-tied analog blocks via hybrid gradient computation

Neural Information Processing Systems (NeurIPS), 2024

Timothy Nest

M. Ernoult

287

05 Sep 2024

Towards General Industrial Intelligence: A Survey on IIoT-Enhanced Continual Large Models

284

02 Sep 2024