ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2312.15184
  4. Cited By
ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and
  Uncertainty in Zeroth-order Optimization

ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-order Optimization

23 December 2023
Shuoran Jiang
Qingcai Chen
Youcheng Pan
Yang Xiang
Yukang Lin
Xiangping Wu
Chuanyi Liu
Xiaobao Song
    ODL
ArXivPDFHTML

Papers citing "ZO-AdaMU Optimizer: Adapting Perturbation by the Momentum and Uncertainty in Zeroth-order Optimization"

6 / 6 papers shown
Title
Second-Order Fine-Tuning without Pain for LLMs:A Hessian Informed Zeroth-Order Optimizer
Second-Order Fine-Tuning without Pain for LLMs:A Hessian Informed Zeroth-Order Optimizer
Yanjun Zhao
Sizhe Dang
Haishan Ye
Guang Dai
Yi Qian
Ivor W.Tsang
58
8
0
23 Feb 2024
An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP
  Tasks
An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks
Yuxiang Wu
Yu Zhao
Baotian Hu
Pasquale Minervini
Pontus Stenetorp
Sebastian Riedel
RALM
KELM
43
42
0
30 Oct 2022
A Kernel-Based View of Language Model Fine-Tuning
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
63
60
0
11 Oct 2022
Making Pre-trained Language Models Better Few-shot Learners
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
238
1,898
0
31 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural
  Language Inference
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
248
1,382
0
21 Jan 2020
Stochastic Gradient Descent for Non-smooth Optimization: Convergence
  Results and Optimal Averaging Schemes
Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir
Tong Zhang
99
570
0
08 Dec 2012
1