2302.06675
Symbolic Discovery of Optimization Algorithms
13 February 2023
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
Yao Liu
Hieu H. Pham
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
Papers citing
"Symbolic Discovery of Optimization Algorithms"
44 / 194 papers shown
Title
Unlocking Accuracy and Fairness in Differentially Private Image Classification
Leonard Berrada
Soham De
J. Shen
Jamie Hayes
Robert Stanforth
David Stutz
Pushmeet Kohli
Samuel L. Smith
Borja Balle
19
13
0
21 Aug 2023
LMTuner: An user-friendly and highly-integrable Training Framework for fine-tuning Large Language Models
Yixuan Weng
Zhiqi Wang
Huanxuan Liao
Shizhu He
Shengping Liu
Kang Liu
Jun Zhao
26
3
0
20 Aug 2023
ChatEDA: A Large Language Model Powered Autonomous Agent for EDA
Zhuolun He
Haoyuan Wu
Xinyun Zhang
Xufeng Yao
Su Zheng
Haisheng Zheng
Bei Yu
LLMAG
19
100
0
20 Aug 2023
Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent
Ruichong Zhang
23
0
0
13 Jul 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
13
41
0
12 Jul 2023
Steel Surface Roughness Parameter Calculations Using Lasers and Machine Learning Models
A. Milne
Xianghua Xie
AI4CE
19
0
0
06 Jul 2023
Scaling MLPs: A Tale of Inductive Bias
Gregor Bachmann
Sotiris Anagnostidis
Thomas Hofmann
32
38
0
23 Jun 2023
When and Why Momentum Accelerates SGD: An Empirical Study
Jingwen Fu
Bohan Wang
Huishuai Zhang
Zhizheng Zhang
Wei Chen
Na Zheng
12
10
0
15 Jun 2023
Rethinking pose estimation in crowds: overcoming the detection information-bottleneck and ambiguity
Mu Zhou
Lucas Stoffl
Mackenzie W. Mathis
Alexander Mathis
VOT
8
16
0
13 Jun 2023
Fast light-field 3D microscopy with out-of-distribution detection and adaptation through Conditional Normalizing Flows
Josué Page Vizcaíno
Panagiotis Symvoulidis
Zeguan Wang
Jonas Jelten
Paolo Favaro
Ed Boyden
Tobias Lasser
17
1
0
10 Jun 2023
Normalization Layers Are All That Sharpness-Aware Minimization Needs
Maximilian Mueller
Tiffany J. Vlaar
David Rolnick
Matthias Hein
10
18
0
07 Jun 2023
White-Box Transformers via Sparse Rate Reduction
Yaodong Yu
Sam Buchanan
Druv Pai
Tianzhe Chu
Ziyang Wu
Shengbang Tong
B. Haeffele
Y. Ma
ViT
16
80
0
01 Jun 2023
Improving Energy Conserving Descent for Machine Learning: Theory and Practice
G. Luca
Alice Gatti
E. Silverstein
10
1
0
01 Jun 2023
Toward Understanding Why Adam Converges Faster Than SGD for Transformers
Yan Pan
Yuanzhi Li
20
40
0
31 May 2023
Mechanic: A Learning Rate Tuner
Ashok Cutkosky
Aaron Defazio
Harsh Mehta
OffRL
16
15
0
31 May 2023
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Atli Kosson
Bettina Messmer
Martin Jaggi
22
11
0
26 May 2023
XGrad: Boosting Gradient-Based Optimizers With Weight Prediction
Lei Guan
Dongsheng Li
Yanqi Shi
Jian Meng
ODL
20
2
0
26 May 2023
HARD: Hard Augmentations for Robust Distillation
Arne F. Nix
Max F. Burg
Fabian H. Sinz
AAML
23
1
0
24 May 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
13
128
0
23 May 2023
Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models
Antoni Bigata Casademunt
Rodrigo Mira
Nikita Drobyshev
Konstantinos Vougioukas
Stavros Petridis
M. Pantic
DiffM
59
2
0
15 May 2023
MoMo: Momentum Models for Adaptive Learning Rates
Fabian Schaipp
Ruben Ohana
Michael Eickenberg
Aaron Defazio
Robert Mansel Gower
27
10
0
12 May 2023
Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness
Liangliang Cao
Bowen Zhang
Chen Chen
Yinfei Yang
Xianzhi Du
Wen‐Cheng Zhang
Zhiyun Lu
Yantao Zheng
CLIP
VLM
19
15
0
08 May 2023
Joint Moment Retrieval and Highlight Detection Via Natural Language Queries
Richard Luo
Austin Peng
Heidi Yap
Koby Beard
ViT
16
0
0
08 May 2023
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be
Frederik Kunstner
Jacques Chen
J. Lavington
Mark W. Schmidt
40
67
0
27 Apr 2023
Stable and low-precision training for large-scale vision-language models
Mitchell Wortsman
Tim Dettmers
Luke Zettlemoyer
Ari S. Morcos
Ali Farhadi
Ludwig Schmidt
MQ
MLLM
VLM
22
38
0
25 Apr 2023
The Case for Hierarchical Deep Learning Inference at the Network Edge
Ghina Al-Atat
Andrea Fresa
Adarsh Prasad Behera
Vishnu Narayanan Moothedath
James Gross
J. Champati
27
8
0
23 Apr 2023
Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models
Ziwei Luo
Fredrik K. Gustafsson
Zhengli Zhao
Jens Sjölund
Thomas B. Schon
30
100
0
17 Apr 2023
Surveillance Face Presentation Attack Detection Challenge
Hao Fang
Ajian Liu
Jun Wan
Sergio Escalera
Hugo Jair Escalante
Zhen Lei
CVBM
AAML
21
12
0
15 Apr 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
60
3,011
0
14 Apr 2023
Cross-View Hierarchy Network for Stereo Image Super-Resolution
Wenbin Zou
Hong-xia Gao
Liang Chen
Yunchen Zhang
Mingchao Jiang
Zhongxin Yu
Ming Tan
SupR
29
11
0
13 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
24
39
0
07 Apr 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
19
931
0
27 Mar 2023
AI-in-the-Loop -- The impact of HMI in AI-based Application
Julius Schöning
C. Westerkamp
8
4
0
21 Mar 2023
EVA-02: A Visual Representation for Neon Genesis
Yuxin Fang
Quan-Sen Sun
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
ViT
CLIP
38
259
0
20 Mar 2023
Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning
Wu Lin
Valentin Duruisseaux
Melvin Leok
Frank Nielsen
Mohammad Emtiyaz Khan
Mark W. Schmidt
33
7
0
20 Feb 2023
Fixing Overconfidence in Dynamic Neural Networks
Lassi Meronen
Martin Trapp
Andrea Pilzer
Le Yang
Arno Solin
BDL
21
16
0
13 Feb 2023
Mnemosyne: Learning to Train Transformers with Transformers
Deepali Jain
K. Choromanski
Kumar Avinava Dubey
Sumeet Singh
Vikas Sindhwani
Tingnan Zhang
Jie Tan
OffRL
31
9
0
02 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
18
47
0
02 Feb 2023
AutoOpt: A General Framework for Automatically Designing Metaheuristic Optimization Algorithms with Diverse Structures
Qi Zhao
Bai Yan
Xianglong Chen
Taiwei Hu
Shi Cheng
Yuhui Shi
12
25
0
03 Apr 2022
Transformer Quality in Linear Time
Weizhe Hua
Zihang Dai
Hanxiao Liu
Quoc V. Le
71
222
0
21 Feb 2022
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
73
129
0
16 Oct 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
264
5,326
0
05 Nov 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,886
0
15 Sep 2016