ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.05898
  4. Cited By
Lion Secretly Solves Constrained Optimization: As Lyapunov Predicts

Lion Secretly Solves Constrained Optimization: As Lyapunov Predicts

9 October 2023
Lizhang Chen
Bo Liu
Kaizhao Liang
Qian Liu
    ODL
ArXivPDFHTML

Papers citing "Lion Secretly Solves Constrained Optimization: As Lyapunov Predicts"

13 / 13 papers shown
Title
Memory-Efficient LLM Training by Various-Grained Low-Rank Projection of Gradients
Memory-Efficient LLM Training by Various-Grained Low-Rank Projection of Gradients
Yezhen Wang
Zhouhao Yang
Brian K Chen
Fanyi Pu
Bo-wen Li
Tianyu Gao
Kenji Kawaguchi
34
0
0
03 May 2025
Understanding Why Adam Outperforms SGD: Gradient Heterogeneity in Transformers
Understanding Why Adam Outperforms SGD: Gradient Heterogeneity in Transformers
Akiyoshi Tomihari
Issei Sato
ODL
59
0
0
31 Jan 2025
Distributed Sign Momentum with Local Steps for Training Transformers
Distributed Sign Momentum with Local Steps for Training Transformers
Shuhua Yu
Ding Zhou
Cong Xie
An Xu
Zhi-Li Zhang
Xin Liu
S. Kar
59
0
0
26 Nov 2024
Cautious Optimizers: Improving Training with One Line of Code
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
95
5
0
25 Nov 2024
Convergence Rate Analysis of LION
Convergence Rate Analysis of LION
Yiming Dong
Huan Li
Zhouchen Lin
25
0
0
12 Nov 2024
A second-order-like optimizer with adaptive gradient scaling for deep
  learning
A second-order-like optimizer with adaptive gradient scaling for deep learning
Jérôme Bolte
Ryan Boustany
Edouard Pauwels
Andrei Purica
ODL
20
0
0
08 Oct 2024
Memory-Efficient LLM Training with Online Subspace Descent
Memory-Efficient LLM Training with Online Subspace Descent
Kaizhao Liang
Bo Liu
Lizhang Chen
Qiang Liu
13
7
0
23 Aug 2024
The Implicit Bias of Adam on Separable Data
The Implicit Bias of Adam on Separable Data
Chenyang Zhang
Difan Zou
Yuan Cao
AI4CE
35
7
0
15 Jun 2024
H-Fac: Memory-Efficient Optimization with Factorized Hamiltonian Descent
H-Fac: Memory-Efficient Optimization with Factorized Hamiltonian Descent
Son Nguyen
Lizhang Chen
Bo Liu
Qiang Liu
20
3
0
14 Jun 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Implicit Bias of AdamW: ℓ∞\ell_\inftyℓ∞​ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
20
12
0
05 Apr 2024
Communication Efficient Distributed Training with Distributed Lion
Communication Efficient Distributed Training with Distributed Lion
Bo Liu
Lemeng Wu
Lizhang Chen
Kaizhao Liang
Jiaxu Zhu
Chen Liang
Raghuraman Krishnamoorthi
Qiang Liu
29
6
0
30 Mar 2024
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on
  Transformers, but Sign Descent Might Be
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be
Frederik Kunstner
Jacques Chen
J. Lavington
Mark W. Schmidt
38
66
0
27 Apr 2023
ImageNet Large Scale Visual Recognition Challenge
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
279
39,083
0
01 Sep 2014
1