1705.08292
Cited By
The Marginal Value of Adaptive Gradient Methods in Machine Learning
23 May 2017
Ashia C. Wilson
Rebecca Roelofs
Mitchell Stern
Nathan Srebro
Benjamin Recht
ODL
Papers citing
"The Marginal Value of Adaptive Gradient Methods in Machine Learning"
27 / 127 papers shown
Title
When Semi-Supervised Learning Meets Transfer Learning: Training Strategies, Models and Datasets
Hong-Yu Zhou
Avital Oliver
Jianxin Wu
Yefeng Zheng
24
22
0
13 Dec 2018
Weakly Supervised Estimation of Shadow Confidence Maps in Fetal Ultrasound Imaging
Qingjie Meng
Matthew Sinclair
V. Zimmer
Benjamin Hou
Martin Rajchl
...
J. Housden
Jacqueline Matthew
Daniel Rueckert
J. Schnabel
Bernhard Kainz
16
1
0
20 Nov 2018
Optimal Adaptive and Accelerated Stochastic Gradient Descent
Qi Deng
Yi Cheng
Guanghui Lan
8
8
0
01 Oct 2018
AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods
Zhiming Zhou
Qingru Zhang
Guansong Lu
Hongwei Wang
Weinan Zhang
Yong Yu
16
66
0
29 Sep 2018
Ensemble Kalman Inversion: A Derivative-Free Technique For Machine Learning Tasks
Nikola B. Kovachki
Andrew M. Stuart
BDL
42
136
0
10 Aug 2018
Face-Cap: Image Captioning using Facial Expression Analysis
Omid Mohamad Nezami
Mark Dras
Peter Anderson
Len Hamey
CVBM
19
27
0
06 Jul 2018
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen
Dongruo Zhou
Yiqi Tang
Ziyan Yang
Yuan Cao
Quanquan Gu
ODL
19
192
0
18 Jun 2018
Data augmentation instead of explicit regularization
Alex Hernández-García
Peter König
30
141
0
11 Jun 2018
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
Rachel A. Ward
Xiaoxia Wu
Léon Bottou
ODL
19
358
0
05 Jun 2018
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
Cody Coleman
Daniel Kang
Deepak Narayanan
Luigi Nardi
Tian Zhao
Jian Zhang
Peter Bailis
K. Olukotun
Christopher Ré
Matei A. Zaharia
13
117
0
04 Jun 2018
Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization
Navid Azizan
B. Hassibi
16
61
0
04 Jun 2018
Nonlinear Acceleration of CNNs
Damien Scieur
Edouard Oyallon
Alexandre d’Aspremont
Francis R. Bach
7
11
0
01 Jun 2018
Optimal ridge penalty for real-world high-dimensional data can be zero or negative due to the implicit ridge regularization
D. Kobak
Jonathan Lomond
Benoit Sanchez
30
89
0
28 May 2018
Non-Vacuous Generalization Bounds at the ImageNet Scale: A PAC-Bayesian Compression Approach
Wenda Zhou
Victor Veitch
Morgane Austern
Ryan P. Adams
Peter Orbanz
32
209
0
16 Apr 2018
On the importance of single directions for generalization
Ari S. Morcos
David Barrett
Neil C. Rabinowitz
M. Botvinick
13
328
0
19 Mar 2018
Characterizing Implicit Bias in Terms of Optimization Geometry
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
AI4CE
35
398
0
22 Feb 2018
signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein
Yu-Xiang Wang
Kamyar Azizzadenesheli
Anima Anandkumar
FedML
ODL
35
1,019
0
13 Feb 2018
Fix your classifier: the marginal value of training the last weight layer
Elad Hoffer
Itay Hubara
Daniel Soudry
27
101
0
14 Jan 2018
Momentum and Stochastic Momentum for Stochastic Gradient, Newton, Proximal Point and Subspace Descent Methods
Nicolas Loizou
Peter Richtárik
17
199
0
27 Dec 2017
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
27
520
0
20 Dec 2017
Predicting Adolescent Suicide Attempts with Neural Networks
Harish S. Bhat
S. Goldman-Mellor
21
26
0
28 Nov 2017
Decoupled Weight Decay Regularization
I. Loshchilov
Frank Hutter
OffRL
36
2,078
0
14 Nov 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
60
1,091
0
07 Aug 2017
Neural Sequence Model Training via α-divergence Minimization
Sotetsu Koyamada
Yuta Kikuchi
Atsunori Kanemura
S. Maeda
S. Ishii
65
0
0
30 Jun 2017
Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane
P. Di Lorenzo
16
9
0
15 Jun 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,502
0
25 Jan 2017
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
281
2,889
0
15 Sep 2016