1705.08292
Cited By
The Marginal Value of Adaptive Gradient Methods in Machine Learning
23 May 2017
Ashia C. Wilson
Rebecca Roelofs
Mitchell Stern
Nathan Srebro
Benjamin Recht
ODL
Papers citing
"The Marginal Value of Adaptive Gradient Methods in Machine Learning"
50 / 127 papers shown
AlphaGrad: Non-Linear Gradient Normalization Optimizer
Soham Sane
ODL
56
0
0
22 Apr 2025
Mixed-State Quantum Denoising Diffusion Probabilistic Model
Gino Kwun
Bingzhi Zhang
Quntao Zhuang
DiffM
89
1
0
26 Nov 2024
Can Learned Optimization Make Reinforcement Learning Less Difficult?
Alexander David Goldie
Chris Xiaoxuan Lu
Matthew Jackson
Shimon Whiteson
Jakob N. Foerster
42
3
0
09 Jul 2024
Gated recurrent neural network with TPE Bayesian optimization for enhancing stock index prediction accuracy
B. Dinda
AIFin
23
0
0
02 Jun 2024
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Damien Martins Gomes
Yanlei Zhang
Eugene Belilovsky
Guy Wolf
Mahdi S. Hosseini
ODL
76
2
0
26 May 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
44
12
0
05 Apr 2024
Fusion Transformer with Object Mask Guidance for Image Forgery Analysis
Dimitrios Karageorgiou
Giorgos Kordopatis-Zilos
Symeon Papadopoulos
ViT
22
5
0
18 Mar 2024
Optimizing Neural Networks with Gradient Lexicase Selection
Lijie Ding
Lee Spector
40
20
0
19 Dec 2023
Principled Weight Initialization for Hypernetworks
Oscar Chang
Lampros Flokas
Hod Lipson
22
73
0
13 Dec 2023
Distributed Extra-gradient with Optimal Complexity and Communication Guarantees
Ali Ramezani-Kebrya
Kimon Antonakopoulos
Igor Krawczuk
Justin Deschenaux
V. Cevher
36
2
0
17 Aug 2023
Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods
A. Ma
Yangchen Pan
Amir-massoud Farahmand
AAML
25
5
0
13 Aug 2023
Benchmarking Deep Learning Frameworks for Automated Diagnosis of Ocular Toxoplasmosis: A Comprehensive Approach to Classification and Segmentation
Syed Samiul Alam
Samiul Based Shuvo
Shams Nafisa Ali
F. Ahmed
Arbil Chakma
Yeonggul Jang
12
5
0
18 May 2023
Stability and Convergence of Distributed Stochastic Approximations with large Unbounded Stochastic Information Delays
Adrian Redder
Arunselvan Ramaswamy
Holger Karl
15
1
0
11 May 2023
Mathematical Challenges in Deep Learning
V. Nia
Guojun Zhang
I. Kobyzev
Michael R. Metel
Xinlin Li
...
S. Hemati
M. Asgharian
Linglong Kong
Wulong Liu
Boxing Chen
AI4CE
VLM
37
1
0
24 Mar 2023
On the Utility of Equal Batch Sizes for Inference in Stochastic Gradient Descent
Rahul Singh
A. Shukla
Dootika Vats
27
0
0
14 Mar 2023
Bayesian Learning for Neural Networks: an algorithmic survey
M. Magris
Alexandros Iosifidis
BDL
DRL
35
68
0
21 Nov 2022
On the Algorithmic Stability and Generalization of Adaptive Optimization Methods
Han Nguyen
Hai Pham
Sashank J. Reddi
Barnabás Póczos
ODL
AI4CE
17
2
0
08 Nov 2022
Adaptive scaling of the learning rate by second order automatic differentiation
F. Gournay
Alban Gossard
ODL
23
1
0
26 Oct 2022
HesScale: Scalable Computation of Hessian Diagonals
Mohamed Elsayed
A. R. Mahmood
14
7
0
20 Oct 2022
MaskTune: Mitigating Spurious Correlations by Forcing to Explore
Saeid Asgari Taghanaki
Aliasghar Khani
Fereshte Khani
A. Gholami
Linh-Tam Tran
Ali Mahdavi-Amiri
Ghassan Hamarneh
AAML
41
45
0
30 Sep 2022
Provable Acceleration of Heavy Ball beyond Quadratics for a Class of Polyak-Łojasiewicz Functions when the Non-Convexity is Averaged-Out
Jun-Kun Wang
Chi-Heng Lin
Andre Wibisono
Bin Hu
29
20
0
22 Jun 2022
Efficient-Adam: Communication-Efficient Distributed Adam
Congliang Chen
Li Shen
Wei Liu
Z. Luo
23
19
0
28 May 2022
One-Pixel Shortcut: on the Learning Preference of Deep Neural Networks
Shutong Wu
Sizhe Chen
Cihang Xie
X. Huang
AAML
45
27
0
24 May 2022
Benchmarking Deep AUROC Optimization: Loss Functions and Algorithmic Choices
Dixian Zhu
Xiaodong Wu
Tianbao Yang
35
10
0
27 Mar 2022
A DNN Optimizer that Improves over AdaBelief by Suppression of the Adaptive Stepsize Range
Guoqiang Zhang
Kenta Niwa
W. Kleijn
ODL
16
2
0
24 Mar 2022
An Adaptive Gradient Method with Energy and Momentum
Hailiang Liu
Xuping Tian
ODL
18
9
0
23 Mar 2022
Adaptive Gradient Methods with Local Guarantees
Zhou Lu
Wenhan Xia
Sanjeev Arora
Elad Hazan
ODL
22
9
0
02 Mar 2022
Optimal learning rate schedules in high-dimensional non-convex optimization problems
Stéphane d'Ascoli
Maria Refinetti
Giulio Biroli
16
7
0
09 Feb 2022
Understanding AdamW through Proximal Methods and Scale-Freeness
Zhenxun Zhuang
Mingrui Liu
Ashok Cutkosky
Francesco Orabona
37
63
0
31 Jan 2022
A Stochastic Bundle Method for Interpolating Networks
Alasdair Paren
Leonard Berrada
Rudra P. K. Poudel
M. P. Kumar
24
4
0
29 Jan 2022
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla
Jing Wang
A. Choromańska
25
34
0
20 Jan 2022
A novel control method for solving high-dimensional Hamiltonian systems through deep neural networks
Shaolin Ji
S. Peng
Ying Peng
Xichuan Zhang
17
1
0
04 Nov 2021
Skin Cancer Classification using Inception Network and Transfer Learning
Priscilla Benedetti
Damiano Perri
Marco Simonetti
O. Gervasi
G. Reali
M. Femminella
16
11
0
03 Nov 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
24
14
0
01 Nov 2021
Towards Model Agnostic Federated Learning Using Knowledge Distillation
A. Afonin
Sai Praneeth Karimireddy
FedML
30
44
0
28 Oct 2021
Training Deep Neural Networks with Adaptive Momentum Inspired by the Quadratic Optimization
Tao Sun
Huaming Ling
Zuoqiang Shi
Dongsheng Li
Bao Wang
ODL
22
13
0
18 Oct 2021
Improving Hyperparameter Optimization by Planning Ahead
H. Jomaa
Jonas K. Falkner
Lars Schmidt-Thieme
22
0
0
15 Oct 2021
Spectral Bias in Practice: The Role of Function Frequency in Generalization
Sara Fridovich-Keil
Raphael Gontijo-Lopes
Rebecca Roelofs
38
28
0
06 Oct 2021
L$^{2}$NAS: Learning to Optimize Neural Architectures via Continuous-Action Reinforcement Learning
Keith G. Mills
Fred X. Han
Mohammad Salameh
Seyed Saeed Changiz Rezaei
Linglong Kong
Wei Lu
Shuo Lian
Shangling Jui
Di Niu
22
10
0
25 Sep 2021
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
Difan Zou
Yuan Cao
Yuanzhi Li
Quanquan Gu
MLT
AI4CE
44
38
0
25 Aug 2021
Logit Attenuating Weight Normalization
Aman Gupta
R. Ramanath
Jun Shi
Anika Ramachandran
Sirou Zhou
Mingzhou Zhou
S. Keerthi
34
1
0
12 Aug 2021
On-Device Content Moderation
Anchal Pandey
Sukumar Moharana
D. Mohanty
Archit Panwar
D. Agarwal
S. Thota
19
7
0
25 Jul 2021
The Bayesian Learning Rule
Mohammad Emtiyaz Khan
Håvard Rue
BDL
57
73
0
09 Jul 2021
How Do Adam and Training Strategies Help BNNs Optimization?
Zechun Liu
Zhiqiang Shen
Shichao Li
K. Helwegen
Dong Huang
Kwang-Ting Cheng
ODL
MQ
22
82
0
21 Jun 2021
Effective Evaluation of Deep Active Learning on Image Classification Tasks
Nathan Beck
D. Sivasubramanian
Apurva Dani
Ganesh Ramakrishnan
Rishabh K. Iyer
VLM
12
37
0
16 Jun 2021
Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
Kushal Chakrabarti
Nikhil Chopra
ODL
AI4CE
29
9
0
31 May 2021
Model Selection's Disparate Impact in Real-World Deep Learning Applications
Jessica Zosa Forde
A. Feder Cooper
Kweku Kwegyir-Aggrey
Chris De Sa
Michael Littman
11
22
0
01 Apr 2021
Intraclass clustering: an implicit learning ability that regularizes DNNs
Simon Carbonnelle
Christophe De Vleeschouwer
43
8
0
11 Mar 2021
CMDNet: Learning a Probabilistic Relaxation of Discrete Variables for Soft Detection with Low Complexity
Edgar Beck
C. Bockelmann
Armin Dekorsy
18
12
0
25 Feb 2021
Enabling Binary Neural Network Training on the Edge
Erwei Wang
James J. Davis
Daniele Moro
Piotr Zielinski
Jia Jie Lim
C. Coelho
S. Chatterjee
P. Cheung
G. Constantinides
MQ
20
24
0
08 Feb 2021