ResearchTrend.AI
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima

15 September 2016
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
    ODL

Papers citing "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"

50 / 1,554 papers shown
Title
Combining resampling and reweighting for faithful stochastic optimization
Jing An
Lexing Ying
39
1
0
31 May 2021
Embedding Principle of Loss Landscape of Deep Neural Networks
Yaoyu Zhang
Zhongwang Zhang
Tao Luo
Z. Xu
67
38
0
30 May 2021
LRTuner: A Learning Rate Tuner for Deep Neural Networks
Nikhil Iyer
V. Thejas
Nipun Kwatra
Ramachandran Ramjee
Muthian Sivathanu
ODL
50
1
0
30 May 2021
Maximizing Parallelism in Distributed Training for Huge Neural Networks
Zhengda Bian
Qifan Xu
Boxiang Wang
Yang You
MoE
55
48
0
30 May 2021
On Linear Stability of SGD and Input-Smoothness of Neural Networks
Chao Ma
Lexing Ying
MLT
66
44
0
27 May 2021
Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error
Stanislav Fort
Andrew Brock
Razvan Pascanu
Soham De
Samuel L. Smith
64
32
0
27 May 2021
Blurs Behave Like Ensembles: Spatial Smoothings to Improve Accuracy, Uncertainty, and Robustness
Namuk Park
S. Kim
UQCV AAML
93
21
0
26 May 2021
Compressing Heavy-Tailed Weight Matrices for Non-Vacuous Generalization Bounds
John Y. Shin
141
5
0
23 May 2021
Variational Quantum Classifiers Through the Lens of the Hessian
Pinaki Sen
Amandeep Singh Bhatia
A. Bhatia
Ahmed Elbeltagi
54
25
0
21 May 2021
Probing the Effect of Selection Bias on Generalization: A Thought Experiment
John K. Tsotsos
Jun Luo
CML
36
3
0
20 May 2021
Power-law escape rate of SGD
Takashi Mori
Liu Ziyin
Kangqiao Liu
Masakuni Ueda
71
18
0
20 May 2021
Self-Supervised Learning for Fine-Grained Visual Categorization
Muhammad Maaz
H. Rasheed
D. Gaddam
47
2
0
18 May 2021
DoS and DDoS Mitigation Using Variational Autoencoders
Eirik Molde Bårli
Anis Yazidi
E. Herrera-Viedma
H. Haugerud
AAML DRL
25
16
0
14 May 2021
Neighborhood-Aware Neural Architecture Search
Xiaofang Wang
Shengcao Cao
Mengtian Li
Kris Kitani
165
6
0
13 May 2021
ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron
Piotr Bojanowski
Mathilde Caron
Matthieu Cord
Alaaeldin El-Nouby
...
Gautier Izacard
Armand Joulin
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
VLM
84
674
0
07 May 2021
A Critical Review of Information Bottleneck Theory and its Applications to Deep Learning
Mohammad Ali Alomrani
87
2
0
07 May 2021
Noether's Learning Dynamics: Role of Symmetry Breaking in Neural Networks
Hidenori Tanaka
D. Kunin
113
31
0
06 May 2021
Modulating Regularization Frequency for Efficient Compression-Aware Model Training
Dongsoo Lee
S. Kwon
Byeongwook Kim
Jeongin Yun
Baeseong Park
Yongkweon Jeon
31
0
0
05 May 2021
Poisoning the Unlabeled Dataset of Semi-Supervised Learning
Nicholas Carlini
AAML
220
68
0
04 May 2021
InfoNEAT: Information Theory-based NeuroEvolution of Augmenting Topologies for Side-channel Analysis
R. Acharya
F. Ganji
Domenic Forte
AAML
101
25
0
30 Apr 2021
Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety
Sebastian Houben
Stephanie Abrecht
Maram Akila
Andreas Bär
Felix Brockherde
...
Serin Varghese
Michael Weber
Sebastian J. Wirkert
Tim Wirtz
Matthias Woehrle
AAML
126
58
0
29 Apr 2021
How Well Does Self-Supervised Pre-Training Perform with Streaming Data?
Dapeng Hu
Shipeng Yan
Qizhengqiu Lu
Lanqing Hong
Hailin Hu
Yifan Zhang
Zhenguo Li
Xinchao Wang
Jiashi Feng
123
29
0
25 Apr 2021
Sync-Switch: Hybrid Parameter Synchronization for Distributed Deep Learning
Shijian Li
Oren Mangoubi
Lijie Xu
Tian Guo
101
15
0
16 Apr 2021
Rehearsal revealed: The limits and merits of revisiting samples in continual learning
Eli Verwimp
Matthias De Lange
Tinne Tuytelaars
CLL
59
108
0
15 Apr 2021
PAC Bayesian Performance Guarantees for Deep (Stochastic) Networks in Medical Imaging
Anthony Sicilia
Xingchen Zhao
Anastasia Sosnovskikh
Seong Jae Hwang
BDL UQCV
55
4
0
12 Apr 2021
Neural basis expansion analysis with exogenous variables: Forecasting electricity prices with NBEATSx
Kin G. Olivares
Cristian Challu
Grzegorz Marcjasz
R. Weron
A. Dubrawski
AI4TS
97
147
0
12 Apr 2021
Epigenetic evolution of deep convolutional models
Alexander Hadjiivanov
Alan Blair
28
1
0
12 Apr 2021
Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning
Alexander Immer
Matthias Bauer
Vincent Fortuin
Gunnar Rätsch
Mohammad Emtiyaz Khan
BDL UQCV
150
109
0
11 Apr 2021
SGD Implicitly Regularizes Generalization Error
Daniel A. Roberts
MLT
67
15
0
10 Apr 2021
Relating Adversarially Robust Generalization to Flat Minima
David Stutz
Matthias Hein
Bernt Schiele
OOD
105
67
0
09 Apr 2021
Training Deep Neural Networks via Branch-and-Bound
Yuanwei Wu
Ziming Zhang
Guanghui Wang
ODL
57
0
0
05 Apr 2021
Estimating the Generalization in Deep Neural Networks via Sparsity
Yang Zhao
Hao Zhang
65
2
0
02 Apr 2021
Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
Zeke Xie
Li-xin Yuan
Zhanxing Zhu
Masashi Sugiyama
123
29
0
31 Mar 2021
Empirically explaining SGD from a line search perspective
Max Mutschler
A. Zell
ODL LRM
46
4
0
31 Mar 2021
Efficient Deep Learning Pipelines for Accurate Cost Estimations Over Large Scale Query Workload
Johan Kok
Zhi Kang
S. Tan
Feng Cheng
Shixuan Sun
Bingsheng He
81
26
0
23 Mar 2021
Student Network Learning via Evolutionary Knowledge Distillation
Kangkai Zhang
Chunhui Zhang
Shikun Li
Dan Zeng
Shiming Ge
77
85
0
23 Mar 2021
Alleviate Exposure Bias in Sequence Prediction with Recurrent Neural Networks
Liping Yuan
Jiangtao Feng
Xiaoqing Zheng
Xuanjing Huang
33
1
0
22 Mar 2021
Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges
Cynthia Rudin
Chaofan Chen
Zhi Chen
Haiyang Huang
Lesia Semenova
Chudi Zhong
FaML AI4CE LRM
240
677
0
20 Mar 2021
Conceptual capacity and effective complexity of neural networks
Lech Szymanski
B. McCane
C. Atkinson
23
1
0
13 Mar 2021
Large Batch Simulation for Deep Reinforcement Learning
Brennan Shacklett
Erik Wijmans
Aleksei Petrenko
Manolis Savva
Dhruv Batra
V. Koltun
Kayvon Fatahalian
3DV OffRL AI4CE
88
26
0
12 Mar 2021
Intraclass clustering: an implicit learning ability that regularizes DNNs
Simon Carbonnelle
Christophe De Vleeschouwer
87
8
0
11 Mar 2021
Why flatness does and does not correlate with generalization for deep neural networks
Shuo Zhang
Isaac Reid
Guillermo Valle Pérez
A. Louis
77
8
0
10 Mar 2021
Robustness to Pruning Predicts Generalization in Deep Neural Networks
Lorenz Kuhn
Clare Lyle
Aidan Gomez
Jonas Rothfuss
Y. Gal
91
14
0
10 Mar 2021
Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem
Francesca Mignacco
Pierfrancesco Urbani
Lenka Zdeborová
93
36
0
08 Mar 2021
Pufferfish: Communication-efficient Models At No Extra Cost
Hongyi Wang
Saurabh Agarwal
Dimitris Papailiopoulos
85
59
0
05 Mar 2021
Evaluation of Complexity Measures for Deep Learning Generalization in Medical Image Analysis
Aleksandar Vakanski
Min Xian
28
7
0
04 Mar 2021
Critical Parameters for Scalable Distributed Learning with Large Batches and Asynchronous Updates
Sebastian U. Stich
Amirkeivan Mohtashami
Martin Jaggi
71
22
0
03 Mar 2021
Formalizing Generalization and Robustness of Neural Networks to Weight Perturbations
Yu-Lin Tsai
Chia-Yi Hsu
Chia-Mu Yu
Pin-Yu Chen
AAML OOD
56
27
0
03 Mar 2021
Hessian Eigenspectra of More Realistic Nonlinear Models
Zhenyu Liao
Michael W. Mahoney
95
31
0
02 Mar 2021
DPlis: Boosting Utility of Differentially Private Deep Learning via Randomized Smoothing
Wenxiao Wang
Tianhao Wang
Lun Wang
Nanqing Luo
Pan Zhou
Basel Alomair
R. Jia
109
16
0
02 Mar 2021