The Marginal Value of Adaptive Gradient Methods in Machine Learning
arXiv:1705.08292 · 23 May 2017 · ODL
Ashia C. Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht
Papers citing "The Marginal Value of Adaptive Gradient Methods in Machine Learning" (showing 50 of 127):
- Advances in Electron Microscopy with Deep Learning · Jeffrey M. Ede · 04 Jan 2021
- An Adaptive Memory Multi-Batch L-BFGS Algorithm for Neural Network Training · Federico Zocco, Seán F. McLoone · ODL · 14 Dec 2020
- The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks · Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu · 11 Dec 2020
- HEBO: Pushing The Limits of Sample-Efficient Hyperparameter Optimisation · Alexander I. Cowen-Rivers, Wenlong Lyu, Rasul Tutunov, Zhi Wang, Antoine Grosnit, ..., A. Maraval, Hao Jianye, Jun Wang, Jan Peters, H. Ammar · 07 Dec 2020
- Sequential convergence of AdaGrad algorithm for smooth convex optimization · Cheik Traoré, Edouard Pauwels · 24 Nov 2020
- Design Space for Graph Neural Networks · Jiaxuan You, Rex Ying, J. Leskovec · GNN, AI4CE · 17 Nov 2020
- A Random Matrix Theory Approach to Damping in Deep Learning · Diego Granziol, Nicholas P. Baskerville · AI4CE, ODL · 15 Nov 2020
- AEGD: Adaptive Gradient Descent with Energy · Hailiang Liu, Xuping Tian · ODL · 10 Oct 2020
- Sharpness-Aware Minimization for Efficiently Improving Generalization · Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur · AAML · 03 Oct 2020
- Effective Regularization Through Loss-Function Metalearning · Santiago Gonzalez, Risto Miikkulainen · 02 Oct 2020
- Review: Deep Learning in Electron Microscopy · Jeffrey M. Ede · 17 Sep 2020
- A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms · Chao Ma, Lei Wu, E. Weinan · ODL · 14 Sep 2020
- Whitening and second order optimization both make information in the dataset unusable during training, and can reduce or prevent generalization · Neha S. Wadia, Daniel Duckworth, S. Schoenholz, Ethan Dyer, Jascha Narain Sohl-Dickstein · 17 Aug 2020
- Obtaining Adjustable Regularization for Free via Iterate Averaging · Jingfeng Wu, Vladimir Braverman, Lin F. Yang · 15 Aug 2020
- Improving neural network predictions of material properties with limited data using transfer learning · Schuyler Krawczuk, D. Venturi · AI4CE · 29 Jun 2020
- Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks · Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P. Dickerson, Tom Goldstein · AAML, TDI · 22 Jun 2020
- When Does Preconditioning Help or Hurt Generalization? · S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu · 18 Jun 2020
- Shape Matters: Understanding the Implicit Bias of the Noise Covariance · Jeff Z. HaoChen, Colin Wei, J. Lee, Tengyu Ma · 15 Jun 2020
- An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias · Lu Yu, Krishnakumar Balasubramanian, S. Volgushev, Murat A. Erdogdu · 14 Jun 2020
- To Each Optimizer a Norm, To Each Norm its Generalization · Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux · 11 Jun 2020
- Geometry-Aware Gradient Algorithms for Neural Architecture Search · Liam Li, M. Khodak, Maria-Florina Balcan, Ameet Talwalkar · 16 Apr 2020
- Iterative Averaging in the Quest for Best Test Error · Diego Granziol, Xingchen Wan, Samuel Albanie, Stephen J. Roberts · 02 Mar 2020
- Statistical Adaptive Stochastic Gradient Methods · Pengchuan Zhang, Hunter Lang, Qiang Liu, Lin Xiao · ODL · 25 Feb 2020
- Stable Training of DNN for Speech Enhancement based on Perceptually-Motivated Black-Box Cost Function · M. Kawanaka, Yuma Koizumi, Ryoichi Miyazaki, Kohei Yatabe · AAML · 14 Feb 2020
- LaProp: Separating Momentum and Adaptivity in Adam · Liu Ziyin, Zhikang T. Wang, Masahito Ueda · ODL · 12 Feb 2020
- On the distance between two neural networks and the stability of learning · Jeremy Bernstein, Arash Vahdat, Yisong Yue, Xuan Li · ODL · 09 Feb 2020
- FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence · Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, E. D. Cubuk, Alexey Kurakin, Han Zhang, Colin Raffel · AAML · 21 Jan 2020
- Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets · Mingrui Liu, Youssef Mroueh, Jerret Ross, Wei Zhang, Xiaodong Cui, Payel Das, Tianbao Yang · ODL · 26 Dec 2019
- Learning Rate Dropout · Huangxing Lin, Weihong Zeng, Xinghao Ding, Yue Huang, Yihong Zhuang, John Paisley · ODL · 30 Nov 2019
- Information-Theoretic Local Minima Characterization and Regularization · Zhiwei Jia, Hao Su · 19 Nov 2019
- An Adaptive and Momental Bound Method for Stochastic Learning · Jianbang Ding, Xuancheng Ren, Ruixuan Luo, Xu Sun · ODL · 27 Oct 2019
- Demon: Improved Neural Network Training with Momentum Decay · John Chen, Cameron R. Wolfe, Zhaoqi Li, Anastasios Kyrillidis · ODL · 11 Oct 2019
- Revisiting Fine-tuning for Few-shot Learning · Akihiro Nakamura, Tatsuya Harada · 01 Oct 2019
- Empirical study towards understanding line search approximations for training neural networks · Younghwan Chae, D. Wilke · 15 Sep 2019
- How Does Learning Rate Decay Help Modern Neural Networks? · Kaichao You, Mingsheng Long, Jianmin Wang, Michael I. Jordan · 05 Aug 2019
- Adaptive Regularization via Residual Smoothing in Deep Learning Optimization · Jung-Kyun Cho, Junseok Kwon, Byung-Woo Hong · 23 Jul 2019
- Invariant Risk Minimization · Martín Arjovsky, Léon Bottou, Ishaan Gulrajani, David Lopez-Paz · OOD · 05 Jul 2019
- Gradient Descent Maximizes the Margin of Homogeneous Neural Networks · Kaifeng Lyu, Jian Li · 13 Jun 2019
- Continuous Time Analysis of Momentum Methods · Nikola B. Kovachki, Andrew M. Stuart · 10 Jun 2019
- The Implicit Bias of AdaGrad on Separable Data · Qian Qian, Xiaoyuan Qian · 09 Jun 2019
- Reducing the variance in online optimization by transporting past gradients · Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux · 08 Jun 2019
- Machine Learning and System Identification for Estimation in Physical Systems · Fredrik Bagge Carlson · OOD · 05 Jun 2019
- The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis · Cynthia Rudin, David Carlson · HAI · 04 Jun 2019
- Why gradient clipping accelerates training: A theoretical justification for adaptivity · Junzhe Zhang, Tianxing He, S. Sra, Ali Jadbabaie · 28 May 2019
- Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models · Mor Shpigel Nacson, Suriya Gunasekar, J. Lee, Nathan Srebro, Daniel Soudry · 17 May 2019
- Reducing Noise in GAN Training with Variance Reduced Extragradient · Tatjana Chavdarova, Gauthier Gidel, F. Fleuret, Simon Lacoste-Julien · 18 Apr 2019
- A Selective Overview of Deep Learning · Jianqing Fan, Cong Ma, Yiqiao Zhong · BDL, VLM · 10 Apr 2019
- CE-Net: Context Encoder Network for 2D Medical Image Segmentation · Zaiwang Gu, Jun Cheng, Huazhu Fu, Kang Zhou, Huaying Hao, Yitian Zhao, Tianyang Zhang, Shenghua Gao, Jiang-Dong Liu · SSeg · 07 Mar 2019
- Segmentation of Roots in Soil with U-Net · Abraham George Smith, Jens Petersen, Raghavendra Selvan, C. Rasmussen · 28 Feb 2019
- An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise · Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba · ODL · 21 Feb 2019