1609.04836
Cited By
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
15 September 2016
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
Papers citing "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"
50 / 1,554 papers shown
Title
Parallel Blockwise Knowledge Distillation for Deep Neural Network Compression
Cody Blakeney
Xiaomin Li
Yan Yan
Ziliang Zong
93
41
0
05 Dec 2020
Representation Based Complexity Measures for Predicting Generalization in Deep Learning
Parth Natekar
Manik Sharma
56
36
0
04 Dec 2020
Accumulated Decoupled Learning: Mitigating Gradient Staleness in Inter-Layer Model Parallelization
Huiping Zhuang
Zhiping Lin
Kar-Ann Toh
119
4
0
03 Dec 2020
FairBatch: Batch Selection for Model Fairness
Yuji Roh
Kangwook Lee
Steven Euijong Whang
Changho Suh
VLM
99
133
0
03 Dec 2020
Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
Chen Xu
Bojie Hu
Yufan Jiang
Kai Feng
Zeyang Wang
Shen Huang
Qi Ju
Tong Xiao
Jingbo Zhu
101
22
0
30 Nov 2020
Is Support Set Diversity Necessary for Meta-Learning?
Amrith Rajagopal Setlur
Oscar Li
Virginia Smith
104
16
0
28 Nov 2020
Gradient Descent for Deep Matrix Factorization: Dynamics and Implicit Bias towards Low Rank
H. Chou
Carsten Gieshoff
J. Maly
Holger Rauhut
81
42
0
27 Nov 2020
Implicit bias of deep linear networks in the large learning rate phase
Wei Huang
Weitao Du
R. Xu
Chunrui Liu
67
2
0
25 Nov 2020
Long Short Term Memory Networks for Bandwidth Forecasting in Mobile Broadband Networks under Mobility
Konstantinos Kousias
A. Pappas
Özgü Alay
A. Argyriou
Michael Riegler
33
1
0
20 Nov 2020
Contrastive Weight Regularization for Large Minibatch SGD
Qiwei Yuan
Weizhe Hua
Yi Zhou
Cunxi Yu
OffRL
83
1
0
17 Nov 2020
EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation using Accelerated Neuroevolution with Weight Transfer
William J. McNally
Kanav Vats
Alexander Wong
J. McPhee
3DH
70
16
0
17 Nov 2020
A Random Matrix Theory Approach to Damping in Deep Learning
Diego Granziol
Nicholas P. Baskerville
AI4CE
ODL
86
2
0
15 Nov 2020
Roof fall hazard detection with convolutional neural networks using transfer learning
E. Isleyen
Sebnem Duzgun
McKell R. Carter
23
3
0
12 Nov 2020
Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
Jack Parker-Holder
Luke Metz
Cinjon Resnick
Hengyuan Hu
Adam Lerer
Alistair Letcher
A. Peysakhovich
Aldo Pacchiano
Jakob N. Foerster
42
24
0
12 Nov 2020
Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting
Zeke Xie
Fengxiang He
Shaopeng Fu
Issei Sato
Dacheng Tao
Masashi Sugiyama
58
61
0
12 Nov 2020
Privacy Preservation in Federated Learning: An insightful survey from the GDPR Perspective
N. Truong
Kai Sun
Siyao Wang
Florian Guitton
Yike Guo
FedML
65
9
0
10 Nov 2020
SALR: Sharpness-aware Learning Rate Scheduler for Improved Generalization
Xubo Yue
Maher Nouiehed
Raed Al Kontar
ODL
38
4
0
10 Nov 2020
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate
Jingfeng Wu
Difan Zou
Vladimir Braverman
Quanquan Gu
102
18
0
04 Nov 2020
SGB: Stochastic Gradient Bound Method for Optimizing Partition Functions
Junchang Wang
A. Choromańska
45
0
0
03 Nov 2020
Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour
Arissa Wongpanich
Hieu H. Pham
J. Demmel
Mingxing Tan
Quoc V. Le
Yang You
Sameer Kumar
78
8
0
30 Oct 2020
Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification
Saurabh Agarwal
Hongyi Wang
Kangwook Lee
Shivaram Venkataraman
Dimitris Papailiopoulos
85
25
0
29 Oct 2020
Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel
Stanislav Fort
Gintare Karolina Dziugaite
Mansheej Paul
Sepideh Kharaghani
Daniel M. Roy
Surya Ganguli
111
193
0
28 Oct 2020
Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping
Minjia Zhang
Yuxiong He
AI4CE
48
104
0
26 Oct 2020
Structural Prior Driven Regularized Deep Learning for Sonar Image Classification
Isaac D. Gerg
V. Monga
18
33
0
26 Oct 2020
Train simultaneously, generalize better: Stability of gradient-based minimax learners
Farzan Farnia
Asuman Ozdaglar
73
48
0
23 Oct 2020
Progressive Batching for Efficient Non-linear Least Squares
Huu Le
Christopher Zach
E. Rosten
Oliver J. Woodford
54
3
0
21 Oct 2020
What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator
Hongyao Tang
Zhaopeng Meng
Jianye Hao
Chong Chen
D. Graves
...
Hangyu Mao
Wulong Liu
Yaodong Yang
Wenyuan Tao
Li Wang
OffRL
71
7
0
19 Oct 2020
Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism
Vipul Gupta
Dhruv Choudhary
P. T. P. Tang
Xiaohan Wei
Xing Wang
Yuzhen Huang
A. Kejariwal
Kannan Ramchandran
Michael W. Mahoney
91
33
0
18 Oct 2020
Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout
Zhao Chen
Jiquan Ngiam
Yanping Huang
Thang Luong
Henrik Kretzschmar
Yuning Chai
Dragomir Anguelov
90
221
0
14 Oct 2020
How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks
Gao Jin
Xinping Yi
Liang Zhang
Lijun Zhang
S. Schewe
Xiaowei Huang
80
42
0
12 Oct 2020
Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou
Jiashi Feng
Chao Ma
Caiming Xiong
Guosheng Lin
E. Weinan
101
235
0
12 Oct 2020
A Deep Learning Framework for Predicting Digital Asset Price Movement from Trade-by-trade Data
Qi Zhao
62
3
0
11 Oct 2020
Regularizing Neural Networks via Adversarial Model Perturbation
Yaowei Zheng
Richong Zhang
Yongyi Mao
AAML
107
99
0
10 Oct 2020
Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks
Yikai Wu
Xingyu Zhu
Chenwei Wu
Annie Wang
Rong Ge
118
45
0
08 Oct 2020
Towards a Scalable and Distributed Infrastructure for Deep Learning Applications
Bita Hasheminezhad
S. Shirzad
Nanmiao Wu
Patrick Diehl
Hannes Schulz
Hartmut Kaiser
GNN
AI4CE
85
4
0
06 Oct 2020
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate
Zhiyuan Li
Kaifeng Lyu
Sanjeev Arora
112
75
0
06 Oct 2020
Accurate, Efficient and Scalable Training of Graph Neural Networks
Hanqing Zeng
Hongkuan Zhou
Ajitesh Srivastava
Rajgopal Kannan
Viktor Prasanna
GNN
30
9
0
05 Oct 2020
Regularizing Dialogue Generation by Imitating Implicit Scenarios
Shaoxiong Feng
Xuancheng Ren
Hongshen Chen
Bin Sun
Kan Li
Xu Sun
95
20
0
05 Oct 2020
Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
AAML
207
1,360
0
03 Oct 2020
Effective Regularization Through Loss-Function Metalearning
Santiago Gonzalez
Xin Qiu
Risto Miikkulainen
128
5
0
02 Oct 2020
CASTLE: Regularization via Auxiliary Causal Graph Discovery
Trent Kyono
Yao Zhang
M. Schaar
OOD
CML
70
69
0
28 Sep 2020
Improved generalization by noise enhancement
Takashi Mori
Masahito Ueda
42
3
0
28 Sep 2020
Learning Optimal Representations with the Decodable Information Bottleneck
Yann Dubois
Douwe Kiela
D. Schwab
Ramakrishna Vedantam
115
43
0
27 Sep 2020
Implicit Gradient Regularization
David Barrett
Benoit Dherin
101
152
0
23 Sep 2020
Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't
E. Weinan
Chao Ma
Stephan Wojtowytsch
Lei Wu
AI4CE
125
134
0
22 Sep 2020
VirtualFlow: Decoupling Deep Learning Models from the Underlying Hardware
Andrew Or
Haoyu Zhang
M. Freedman
73
10
0
20 Sep 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
197
80
0
17 Sep 2020
Analysis of Generalizability of Deep Neural Networks Based on the Complexity of Decision Boundary
Shuyue Guan
Murray H. Loew
87
29
0
16 Sep 2020
Collaborative Group Learning
Shaoxiong Feng
Hongshen Chen
Xuancheng Ren
Zhuoye Ding
Kan Li
Xu Sun
61
8
0
16 Sep 2020
Deforming the Loss Surface to Affect the Behaviour of the Optimizer
Liangming Chen
Long Jin
Xiujuan Du
Shuai Li
Mei Liu
ODL
21
2
0
14 Sep 2020