1609.04836
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
15 September 2016
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
Papers citing
"On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"
50 / 1,554 papers shown
Title
Papaya: Practical, Private, and Scalable Federated Learning
Dzmitry Huba
John Nguyen
Kshitiz Malik
Ruiyu Zhu
Michael G. Rabbat
...
H. Srinivas
Kaikai Wang
Anthony Shoumikhin
Jesik Min
Mani Malek
FedML
152
141
0
08 Nov 2021
Exponential escape efficiency of SGD from sharp minima in non-stationary regime
Hikaru Ibayashi
Masaaki Imaizumi
97
4
0
07 Nov 2021
Dropout in Training Neural Networks: Flatness of Solution and Noise Structure
Zhongwang Zhang
Hanxu Zhou
Zhi-Qin John Xu
ODL
63
2
0
01 Nov 2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He
Fuzhao Xue
Xiaozhe Ren
Yang You
83
15
0
01 Nov 2021
GBK-GNN: Gated Bi-Kernel Graph Neural Networks for Modeling Both Homophily and Heterophily
Lun Du
Xiaozhou Shi
Qiang Fu
Xiaojun Ma
Hengyu Liu
Shi Han
Dongmei Zhang
127
114
0
29 Oct 2021
CAP: Co-Adversarial Perturbation on Weights and Features for Improving Generalization of Graph Neural Networks
Hao Xue
Kaixiong Zhou
Tianlong Chen
Kai Guo
Helen Zhou
Yi Chang
Xin Wang
AAML
71
15
0
28 Oct 2021
Masked LARk: Masked Learning, Aggregation and Reporting worKflow
Joseph J. Pfeiffer
Denis Xavier Charles
Davis Gilton
Young Hun Jung
Mehul Parsana
Erik Anderson
74
11
0
27 Oct 2021
Multilayer Lookahead: a Nested Version of Lookahead
Denys Pushkin
Luis Barba
97
1
0
27 Oct 2021
RoMA: Robust Model Adaptation for Offline Model-based Optimization
Sihyun Yu
SungSoo Ahn
Le Song
Jinwoo Shin
OffRL
95
36
0
27 Oct 2021
Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD
Bohan Wang
Huishuai Zhang
Jieyu Zhang
Qi Meng
Wei Chen
Tie-Yan Liu
22
1
0
26 Oct 2021
Stable Anderson Acceleration for Deep Learning
Massimiliano Lupo Pasini
Junqi Yin
Viktor Reshniak
M. Stoyanov
59
4
0
26 Oct 2021
Generalized Resubstitution for Classification Error Estimation
P. Ghane
U. Braga-Neto
16
2
0
23 Oct 2021
Feature Learning and Signal Propagation in Deep Neural Networks
Yizhang Lou
Chris Mingard
Yoonsoo Nam
Soufiane Hayou
MDE
82
18
0
22 Oct 2021
Boosting Resource-Constrained Federated Learning Systems with Guessed Updates
Mohamed Yassine Boukhari
Akash Dhasade
Anne-Marie Kermarrec
Rafael Pires
Othmane Safsafi
Rishi Sharma
FedML
75
0
0
21 Oct 2021
Test time Adaptation through Perturbation Robustness
Prabhu Teja Sivaprasad
François Fleuret
TTA
OOD
69
34
0
19 Oct 2021
Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri
H. Mobahi
Yi Tay
182
104
0
16 Oct 2021
Trade-offs of Local SGD at Scale: An Empirical Study
Jose Javier Gonzalez Ortiz
Jonathan Frankle
Michael G. Rabbat
Ari S. Morcos
Nicolas Ballas
FedML
86
18
0
15 Oct 2021
What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
Zhiyuan Li
Tianhao Wang
Sanjeev Arora
MLT
121
105
0
13 Oct 2021
The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks
R. Entezari
Hanie Sedghi
O. Saukh
Behnam Neyshabur
MoMe
102
238
0
12 Oct 2021
Not all noise is accounted equally: How differentially private learning benefits from large sampling rates
Friedrich Dörmann
Osvald Frisk
L. Andersen
Christian Fischer Pedersen
FedML
98
25
0
12 Oct 2021
Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations
Jiayao Zhang
Hua Wang
Weijie J. Su
96
8
0
11 Oct 2021
Observations on K-image Expansion of Image-Mixing Augmentation for Classification
Joonhyun Jeong
Sungmin Cha
Jongwon Choi
Sangdoo Yun
Taesup Moon
Y. Yoo
VLM
90
7
0
08 Oct 2021
Does Momentum Change the Implicit Regularization on Separable Data?
Bohan Wang
Qi Meng
Huishuai Zhang
Ruoyu Sun
Wei Chen
Zhirui Ma
Tie-Yan Liu
99
18
0
08 Oct 2021
Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
Jiawei Du
Hanshu Yan
Jiashi Feng
Qiufeng Wang
Liangli Zhen
Rick Siow Mong Goh
Vincent Y. F. Tan
AAML
177
135
0
07 Oct 2021
Label Noise in Adversarial Training: A Novel Perspective to Study Robust Overfitting
Chengyu Dong
Liyuan Liu
Jingbo Shang
NoLa
AAML
119
20
0
07 Oct 2021
On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications
Ziqiao Wang
Yongyi Mao
FedML
MLT
124
26
0
07 Oct 2021
Spectral Bias in Practice: The Role of Function Frequency in Generalization
Sara Fridovich-Keil
Raphael Gontijo-Lopes
Rebecca Roelofs
107
30
0
06 Oct 2021
Perturbated Gradients Updating within Unit Space for Deep Learning
Ching-Hsun Tseng
Liu Cheng
Shin-Jye Lee
Xiaojun Zeng
111
5
0
01 Oct 2021
Accelerating Encrypted Computing on Intel GPUs
Yujia Zhai
Mohannad Ibrahim
Yiqin Qiu
Fabian Boemer
Zizhong Chen
Alexey Titov
Alexander Lyashevsky
130
26
0
29 Sep 2021
Second-Order Neural ODE Optimizer
Guan-Horng Liu
T. Chen
Evangelos A. Theodorou
77
15
0
29 Sep 2021
Stochastic Training is Not Necessary for Generalization
Jonas Geiping
Micah Goldblum
Phillip E. Pope
Michael Moeller
Tom Goldstein
173
76
0
29 Sep 2021
Scalable deeper graph neural networks for high-performance materials property prediction
Sadman Sadeed Omee
Steph-Yves M. Louis
Nihang Fu
Lai Wei
Sourin Dey
Rongzhi Dong
Qinyang Li
Jianjun Hu
132
77
0
25 Sep 2021
Towards Generalized and Incremental Few-Shot Object Detection
Yiting Li
H. Zhu
Jun Ma
C. Teo
Chen Xiang
P. Vadakkepat
T. Lee
CLL
ObjD
66
9
0
23 Sep 2021
Patch-based Medical Image Segmentation using Matrix Product State Tensor Networks
Raghavendra Selvan
Erik Dam
Soren Alexander Flensborg
Jens Petersen
MedIm
98
2
0
15 Sep 2021
DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture
Kaichen Zhou
Lanqing Hong
Shuailiang Hu
Fengwei Zhou
Binxin Ru
Jiashi Feng
Zhenguo Li
84
10
0
13 Sep 2021
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Runxin Xu
Fuli Luo
Zhiyuan Zhang
Chuanqi Tan
Baobao Chang
Songfang Huang
Fei Huang
LRM
193
190
0
13 Sep 2021
A Continuous Optimisation Benchmark Suite from Neural Network Regression
K. Malan
C. Cleghorn
ODL
39
1
0
12 Sep 2021
MLReal: Bridging the gap between training on synthetic data and real data applications in machine learning
T. Alkhalifah
Hanchen Wang
O. Ovcharenko
OOD
99
68
0
11 Sep 2021
Adversarial Parameter Defense by Multi-Step Risk Minimization
Zhiyuan Zhang
Ruixuan Luo
Xuancheng Ren
Qi Su
Liangyou Li
Xu Sun
AAML
64
6
0
07 Sep 2021
Deep Convolutional Neural Networks Predict Elasticity Tensors and their Bounds in Homogenization
B. Eidel
3DV
35
2
0
04 Sep 2021
How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data
Zhiyuan Zhang
Lingjuan Lyu
Weiqiang Wang
Lichao Sun
Xu Sun
86
36
0
03 Sep 2021
The Impact of Reinitialization on Generalization in Convolutional Neural Networks
Ibrahim Alabdulmohsin
Hartmut Maennel
Daniel Keysers
AI4CE
61
21
0
01 Sep 2021
HAT4RD: Hierarchical Adversarial Training for Rumor Detection on Social Media
Shiwen Ni
Jiawen Li
Hung-Yu Kao
72
7
0
29 Aug 2021
DropAttack: A Masked Weight Adversarial Training Method to Improve Generalization of Neural Networks
Shiwen Ni
Jiawen Li
Hung-Yu Kao
AAML
61
4
0
29 Aug 2021
Re-using Adversarial Mask Discriminators for Test-time Training under Distribution Shifts
Gabriele Valvano
Andrea Leo
Sotirios A. Tsaftaris
74
6
0
26 Aug 2021
Measurement of Hybrid Rocket Solid Fuel Regression Rate for a Slab Burner using Deep Learning
Gabriel Surina
G. Georgalis
Siddhant S. Aphale
A. Patra
P. DesJardin
8
11
0
25 Aug 2021
Shift-Curvature, SGD, and Generalization
Arwen V. Bradley
C. Gomez-Uribe
Manish Reddy Vuyyuru
62
3
0
21 Aug 2021
Learning from Images: Proactive Caching with Parallel Convolutional Neural Networks
Yantong Wang
Ye Hu
Zhaohui Yang
Walid Saad
Kai‐Kit Wong
V. Friderikos
134
4
0
15 Aug 2021
Implicit Regularization of Bregman Proximal Point Algorithm and Mirror Descent on Separable Data
Yan Li
Caleb Ju
Ethan X. Fang
T. Zhao
69
9
0
15 Aug 2021
Logit Attenuating Weight Normalization
Aman Gupta
R. Ramanath
Jun Shi
Anika Ramachandran
Sirou Zhou
Mingzhou Zhou
S. Keerthi
75
1
0
12 Aug 2021