Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

13 August 2022
Xingyu Xie
Pan Zhou
Huan Li
Zhouchen Lin
Shuicheng Yan
    ODL

Papers citing "Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models"

26 / 26 papers shown
DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction
Rudy Morel
Jiequn Han
Edouard Oyallon
AI4CE
51
0
0
28 Apr 2025
Tree-NeRV: A Tree-Structured Neural Representation for Efficient Non-Uniform Video Encoding
Jiancheng Zhao
Yifan Zhan
Qingtian Zhu
Mingze Ma
Muyao Niu
Zunian Wan
Xiang Ji
Yinqiang Zheng
24
0
0
17 Apr 2025
FANeRV: Frequency Separation and Augmentation based Neural Representation for Video
Li Yu
Zhihui Li
Chao Yao
Jimin Xiao
M. Gabbouj
30
0
0
09 Apr 2025
DRoPE: Directional Rotary Position Embedding for Efficient Agent Interaction Modeling
Jianbo Zhao
Taiyu Ban
Zhihao Liu
Hangning Zhou
Xiyang Wang
Qibin Zhou
Hailong Qin
Mu Yang
Lei Liu
Bin Li
60
0
0
19 Mar 2025
Striving for Simplicity: Simple Yet Effective Prior-Aware Pseudo-Labeling for Semi-Supervised Ultrasound Image Segmentation
Yaxiong Chen
Yujie Wang
Zixuan Zheng
Jingliang Hu
Yilei Shi
Shengwu Xiong
Xiao Xiang Zhu
Lichao Mou
52
0
0
18 Mar 2025
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao
Xiaoyu Li
Zhao-quan Song
ODL
83
2
0
22 Dec 2024
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
98
5
0
25 Nov 2024
RedPajama: an Open Dataset for Training Large Language Models
Maurice Weber
Daniel Y. Fu
Quentin Anthony
Yonatan Oren
S. Adams
...
Tri Dao
Percy Liang
Christopher Ré
Irina Rish
Ce Zhang
98
52
0
19 Nov 2024
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
T. Pham
Tri Ton
Chang D. Yoo
36
3
0
03 Oct 2024
Imaging foundation model for universal enhancement of non-ideal measurement CT
Yuxin Liu
Rongjun Ge
Yuting He
Zhan Wu
Chenyu You
Yuan Gao
Chenyu You
Ge Wang
Yang Chen
Shuo Li
MedIm
21
2
0
02 Oct 2024
Tamper-Resistant Safeguards for Open-Weight LLMs
Rishub Tamirisa
Bhrugu Bharathi
Long Phan
Andy Zhou
Alice Gatti
...
Andy Zou
Dawn Song
Bo Li
Dan Hendrycks
Mantas Mazeika
AAML
MU
47
37
0
01 Aug 2024
Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum
H. Cai
Sulaiman A. Alghunaim
Ali H. Sayed
41
1
0
18 Jun 2024
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data
Qihao Liu
Yi Zhang
Song Bai
Adam Kortylewski
Alan Yuille
34
9
0
06 Jun 2024
PAON: A New Neuron Model using Padé Approximants
Onur Keleş
A. Murat Tekalp
24
1
0
18 Mar 2024
Boosting Neural Representations for Videos with a Conditional Decoder
Xinjie Zhang
Ren Yang
Dailan He
Xingtong Ge
Tongda Xu
Yan Wang
Hongwei Qin
Jun Zhang
32
15
0
28 Feb 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
24
13
0
08 Feb 2024
Lightweight High-Speed Photography Built on Coded Exposure and Implicit Neural Representation of Videos
Zhihong Zhang
Runzhao Yang
J. Suo
Yuxiao Cheng
Qionghai Dai
25
0
0
22 Nov 2023
Collaborative Score Distillation for Consistent Visual Synthesis
Subin Kim
Kyungmin Lee
June Suk Choi
Jongheon Jeong
Kihyuk Sohn
Jinwoo Shin
DiffM
19
21
0
04 Jul 2023
Two Sides of One Coin: the Limits of Untuned SGD and the Power of Adaptive Methods
Junchi Yang
Xiang Li
Ilyas Fatkhullin
Niao He
29
15
0
21 May 2023
DNeRV: Modeling Inherent Dynamics via Difference Neural Representation for Videos
Qi Zhao
M. Salman Asif
Zhan Ma
13
31
0
13 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
24
39
0
07 Apr 2023
Understanding Imbalanced Semantic Segmentation Through Neural Collapse
Zhisheng Zhong
Jiequan Cui
Yibo Yang
Xiaoyang Wu
Xiaojuan Qi
X. Zhang
Jiaya Jia
124
45
0
03 Jan 2023
Restarted Nonconvex Accelerated Gradient Descent: No More Polylogarithmic Factor in the $O(ε^{-7/4})$ Complexity
Huan Li
Zhouchen Lin
37
21
0
27 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du
Hanshu Yan
Jiashi Feng
Joey Tianyi Zhou
Liangli Zhen
Rick Siow Mong Goh
Vincent Y. F. Tan
AAML
102
132
0
07 Oct 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
207
484
0
01 Oct 2021