ResearchTrend.AI
Symbolic Discovery of Optimization Algorithms

13 February 2023
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
Yao Liu
Hieu H. Pham
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le

Papers citing "Symbolic Discovery of Optimization Algorithms"

50 / 194 papers shown
ReactDance: Progressive-Granular Representation for Long-Term Coherent Reactive Dance Generation
Jingzhong Lin
Yuanyuan Qi
Xinru Li
Wenxuan Huang
Xiangfeng Xu
Bangyan Li
Xuejiao Wang
Gaoqi He
29
0
0
08 May 2025
Binding threshold units with artificial oscillatory neurons
V. Fanaskov
Ivan V. Oseledets
47
0
0
06 May 2025
Pushing the Limits of Low-Bit Optimizers: A Focus on EMA Dynamics
Cong Xu
Wenbin Liang
Mo Yu
Anan Liu
K. Zhang
Lizhuang Ma
J. Wang
J. Wang
W. Zhang
MQ
51
0
0
01 May 2025
DualOptim: Enhancing Efficacy and Stability in Machine Unlearning with Dual Optimizers
Xuyang Zhong
Haochen Luo
Chen Liu
MU
25
0
0
22 Apr 2025
AlphaGrad: Non-Linear Gradient Normalization Optimizer
Soham Sane
ODL
46
0
0
22 Apr 2025
Mitigating Spectral Bias in Neural Operators via High-Frequency Scaling for Physical Systems
Siavash Khodakarami
Vivek Oommen
Aniruddha Bora
George Karniadakis
AI4CE
60
1
0
17 Mar 2025
Detection Avoidance Techniques for Large Language Models
Sinclair Schneider
Florian Steuber
João A. G. Schneider
Gabi Dreo Rodosek
DeLMO
78
0
0
10 Mar 2025
When Can You Get Away with Low Memory Adam?
Dayal Singh Kalra
John Kirchenbauer
M. Barkeshli
Tom Goldstein
69
0
0
03 Mar 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
75
1
0
26 Feb 2025
COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
Liming Liu
Zhenghao Xu
Zixuan Zhang
Hao Kang
Zichong Li
Chen Liang
Weizhu Chen
T. Zhao
90
1
0
24 Feb 2025
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam
Tianjin Huang
Haotian Hu
Zhenyu (Allen) Zhang
Gaojie Jin
X. Li
...
Tianlong Chen
Lu Liu
Qingsong Wen
Zhangyang Wang
Shiwei Liu
MQ
35
0
0
24 Feb 2025
DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications
Ibrahim Fayad
Max Zimmer
Martin Schwartz
P. Ciais
Fabian Gieseke
Gabriel Belouze
Sarah Brood
A. D. Truchis
Alexandre d’Aspremont
AI4TS
38
0
0
24 Feb 2025
Preconditioned Inexact Stochastic ADMM for Deep Model
Shenglong Zhou
Ouya Wang
Ziyan Luo
Yongxu Zhu
Geoffrey Ye Li
36
0
0
15 Feb 2025
TLOB: A Novel Transformer Model with Dual Attention for Price Trend Prediction with Limit Order Book Data
Leonardo Berti
Gjergji Kasneci
AI4TS
40
0
0
12 Feb 2025
Spectral-factorized Positive-definite Curvature Learning for NN Training
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard E. Turner
Roger B. Grosse
45
0
0
10 Feb 2025
Gradient Multi-Normalization for Stateless and Scalable LLM Training
M. Scetbon
Chao Ma
Wenbo Gong
Edward Meeds
97
1
0
10 Feb 2025
Understanding Why Adam Outperforms SGD: Gradient Heterogeneity in Transformers
Akiyoshi Tomihari
Issei Sato
ODL
59
0
0
31 Jan 2025
Physics of Skill Learning
Ziming Liu
Yizhou Liu
Eric J. Michaud
Jeff Gore
Max Tegmark
44
1
0
21 Jan 2025
FOCUS: First Order Concentrated Updating Scheme
Yizhou Liu
Ziming Liu
Jeff Gore
ODL
104
1
0
21 Jan 2025
A Survey on Memory-Efficient Large-Scale Model Training in AI for Science
Kaiyuan Tian
Linbo Qiao
Baihui Liu
Gongqingjian Jiang
Dongsheng Li
31
0
0
21 Jan 2025
3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results
Benjamin Kiefer
Lojze Žust
Jon Muhovič
Matej Kristan
J. Pers
...
Ashraf Saleem
Ching-Heng Cheng
Yu-Fan Lin
Tzu-Yu Lin
Chih-Chung Hsu
43
0
0
20 Jan 2025
Towards Mitigating Architecture Overfitting on Distilled Datasets
Xuyang Zhong
Chen Liu
DD
55
0
0
08 Jan 2025
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao
Xiaoyu Li
Zhao-quan Song
ODL
83
2
0
22 Dec 2024
Navigating limitations with precision: A fine-grained ensemble approach to wrist pathology recognition on a limited x-ray dataset
Ammar Ahmed
Ali Shariq Imran
M. Ullah
Zenun Kastrati
Sher Muhammad Daudpota
79
0
0
18 Dec 2024
Distributed Sign Momentum with Local Steps for Training Transformers
Shuhua Yu
Ding Zhou
Cong Xie
An Xu
Zhi-Li Zhang
Xin Liu
S. Kar
64
0
0
26 Nov 2024
Lion Cub: Minimizing Communication Overhead in Distributed Lion
Satoki Ishikawa
Tal Ben-Nun
B. Van Essen
Rio Yokota
Nikoli Dryden
69
0
0
25 Nov 2024
Cautious Optimizers: Improving Training with One Line of Code
Kaizhao Liang
Lizhang Chen
B. Liu
Qiang Liu
ODL
98
5
0
25 Nov 2024
Disentangling the Complex Multiplexed DIA Spectra in De Novo Peptide Sequencing
Zheng Ma
Zeping Mao
Ruixue Zhang
Jiazhen Chen
L. Xin
Paul Shan
A. Ghodsi
Ming Li
75
0
0
24 Nov 2024
Deep Feature Response Discriminative Calibration
Wenxiang Xu
Tian Qiu
Linyun Zhou
Zunlei Feng
Mingli Song
Huiqiong Wang
72
0
0
16 Nov 2024
NeuralDEM -- Real-time Simulation of Industrial Particulate Flows
Benedikt Alkin
Tobias Kronlachner
Samuele Papa
Stefan Pirker
Thomas Lichtenegger
Johannes Brandstetter
PINN
AI4CE
38
1
1
14 Nov 2024
Convergence Rate Analysis of LION
Yiming Dong
Huan Li
Zhouchen Lin
35
0
0
12 Nov 2024
ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy
Kian Kenyon-Dean
Zitong Jerry Wang
John Urbanik
Konstantin Donhauser
Jason Hartford
...
Safiye Celik
Marta Fay
Juan Sebastian Rodriguez Vera
I. Haque
Oren Z. Kraus
MedIm
35
4
0
04 Nov 2024
A Lorentz-Equivariant Transformer for All of the LHC
Johann Brehmer
Victor Bresó
P. D. Haan
Tilman Plehn
Huilin Qu
Jonas Spinner
Jesse Thaler
BDL
28
10
0
01 Nov 2024
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
Atli Kosson
Bettina Messmer
Martin Jaggi
AI4CE
18
2
0
31 Oct 2024
MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts
Jie Zhu
Y. Chen
Mingyu Ding
Ping Luo
Leye Wang
Jingdong Wang
DiffM
34
2
0
30 Oct 2024
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
Jui-Nan Yen
Si Si
Zhao Meng
Felix X. Yu
Sai Surya Duvvuri
Inderjit Dhillon
Cho-Jui Hsieh
Sanjiv Kumar
27
1
0
27 Oct 2024
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
Haocheng Xi
Han Cai
Ligeng Zhu
Y. Lu
Kurt Keutzer
Jianfei Chen
Song Han
MQ
63
9
0
25 Oct 2024
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning
Siyuan Li
Juanxi Tian
Zedong Wang
Luyuan Zhang
Zicheng Liu
Weiyang Jin
Yang Liu
Baigui Sun
Stan Z. Li
29
0
0
08 Oct 2024
A second-order-like optimizer with adaptive gradient scaling for deep learning
Jérôme Bolte
Ryan Boustany
Edouard Pauwels
Andrei Purica
ODL
30
0
0
08 Oct 2024
Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks
Christopher J. Harvey
Sumaiya Shomaji
Zijun Yao
Amit Noheria
24
1
0
03 Oct 2024
Old Optimizer, New Norm: An Anthology
Jeremy Bernstein
Laker Newhouse
ODL
36
12
0
30 Sep 2024
Faster Acceleration for Steepest Descent
Site Bai
Brian Bullins
ODL
26
0
0
28 Sep 2024
NTIRE 2024 Challenge on Stereo Image Super-Resolution: Methods and Results
Longguang Wang
Yulan Guo
Juncheng Li
Hongda Liu
Yang Zhao
Yingqian Wang
Zhi Jin
Shuhang Gu
Radu Timofte
SupR
72
17
0
25 Sep 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
59
23
0
17 Sep 2024
A framework for measuring the training efficiency of a neural architecture
Eduardo Cueto-Mendoza
John D. Kelleher
38
0
0
12 Sep 2024
The AdEMAMix Optimizer: Better, Faster, Older
Matteo Pagliardini
Pierre Ablin
David Grangier
ODL
28
8
0
05 Sep 2024
SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
Yi-Chia Chen
Wei-Hua Li
Cheng Sun
Yu-Chiang Frank Wang
Chu-Song Chen
VLM
35
10
0
01 Sep 2024
HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling
Yubin Wang
Xinyang Jiang
De Cheng
Wenli Sun
Dongsheng Li
Cairong Zhao
VLM
40
0
0
27 Aug 2024
Memory-Efficient LLM Training with Online Subspace Descent
Kaizhao Liang
Bo Liu
Lizhang Chen
Qiang Liu
24
7
0
23 Aug 2024
Narrowing the Focus: Learned Optimizers for Pretrained Models
Gus Kristiansen
Mark Sandler
A. Zhmoginov
Nolan Miller
Anirudh Goyal
Jihwan Lee
Max Vladymyrov
34
1
0
17 Aug 2024