1802.08241
Cited By
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
22 February 2018
Z. Yao
A. Gholami
Qi Lei
Kurt Keutzer
Michael W. Mahoney
Papers citing "Hessian-based Analysis of Large Batch Training and Robustness to Adversaries"
42 / 42 papers shown
Title
A Model Zoo on Phase Transitions in Neural Networks
Konstantin Schurholt
Léo Meynent
Yefan Zhou
Haiquan Lu
Yaoqing Yang
Damian Borth
68
0
0
25 Apr 2025
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurélien Lucchi
AI4CE
39
0
0
04 Nov 2024
Do Sharpness-based Optimizers Improve Generalization in Medical Image Analysis?
Mohamed Hassan
Aleksandar Vakanski
Min Xian
AAML
MedIm
41
1
0
07 Aug 2024
P²-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer
Huihong Shi
Xin Cheng
Wendong Mao
Zhongfeng Wang
MQ
40
3
0
30 May 2024
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker
Frederick Altrock
Benjamin Risse
76
5
0
22 Jan 2024
The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
20
7
0
15 Jul 2023
How to escape sharp minima with random perturbations
Kwangjun Ahn
Ali Jadbabaie
S. Sra
ODL
29
6
0
25 May 2023
Learning Rate Schedules in the Presence of Distribution Shift
Matthew Fahrbach
Adel Javanmard
Vahab Mirrokni
Pratik Worah
19
6
0
27 Mar 2023
Randomized Adversarial Training via Taylor Expansion
Gao Jin
Xinping Yi
Dengyu Wu
Ronghui Mu
Xiaowei Huang
AAML
36
34
0
19 Mar 2023
On the Overlooked Structure of Stochastic Gradients
Zeke Xie
Qian-Yuan Tang
Mingming Sun
P. Li
23
6
0
05 Dec 2022
Fairness Increases Adversarial Vulnerability
Cuong Tran
Keyu Zhu
Ferdinando Fioretto
Pascal Van Hentenryck
23
6
0
21 Nov 2022
A New Perspective for Understanding Generalization Gap of Deep Neural Networks Trained with Large Batch Sizes
O. Oyedotun
Konstantinos Papadopoulos
Djamila Aouada
AI4CE
26
11
0
21 Oct 2022
Differential Privacy and Fairness in Decisions and Learning Tasks: A Survey
Ferdinando Fioretto
Cuong Tran
Pascal Van Hentenryck
Keyu Zhu
FaML
24
60
0
16 Feb 2022
Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation
Rihan Chen
Bin Liu
Han Zhu
Yao Wang
Qi Li
...
Q. Hua
Junliang Jiang
Yunlong Xu
Hongbo Deng
Bo Zheng
23
20
0
14 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
42
8
0
31 Jan 2022
GOSH: Task Scheduling Using Deep Surrogate Models in Fog Computing Environments
Shreshth Tuli
G. Casale
N. Jennings
24
21
0
16 Dec 2021
Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning
Matías Mendieta
Taojiannan Yang
Pu Wang
Minwoo Lee
Zhengming Ding
C. L. P. Chen
FedML
19
158
0
28 Nov 2021
Characterizing possible failure modes in physics-informed neural networks
Aditi S. Krishnapriyan
A. Gholami
Shandian Zhe
Robert M. Kirby
Michael W. Mahoney
PINN
AI4CE
25
607
0
02 Sep 2021
Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
Yossi Arjevani
M. Field
28
18
0
21 Jul 2021
Implicit Gradient Alignment in Distributed and Federated Learning
Yatin Dandi
Luis Barba
Martin Jaggi
FedML
18
31
0
25 Jun 2021
Concurrent Adversarial Learning for Large-Batch Training
Yong Liu
Xiangning Chen
Minhao Cheng
Cho-Jui Hsieh
Yang You
ODL
28
13
0
01 Jun 2021
Relating Adversarially Robust Generalization to Flat Minima
David Stutz
Matthias Hein
Bernt Schiele
OOD
24
65
0
09 Apr 2021
On the Utility of Gradient Compression in Distributed Training Systems
Saurabh Agarwal
Hongyi Wang
Shivaram Venkataraman
Dimitris Papailiopoulos
23
46
0
28 Feb 2021
A Random Matrix Theory Approach to Damping in Deep Learning
Diego Granziol
Nicholas P. Baskerville
AI4CE
ODL
24
2
0
15 Nov 2020
Lipschitz Recurrent Neural Networks
N. Benjamin Erichson
Omri Azencot
A. Queiruga
Liam Hodgkinson
Michael W. Mahoney
28
107
0
22 Jun 2020
On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them
Chen Liu
Mathieu Salzmann
Tao R. Lin
Ryota Tomioka
Sabine Süsstrunk
AAML
19
81
0
15 Jun 2020
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
Stanislaw Jastrzebski
Maciej Szymczak
Stanislav Fort
Devansh Arpit
Jacek Tabor
Kyunghyun Cho
Krzysztof J. Geras
40
154
0
21 Feb 2020
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima
Zeke Xie
Issei Sato
Masashi Sugiyama
ODL
20
17
0
10 Feb 2020
Analysis of Random Perturbations for Robust Convolutional Neural Networks
Adam Dziedzic
S. Krishnan
OOD
AAML
16
1
0
08 Feb 2020
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Zhen Dong
Z. Yao
Yaohui Cai
Daiyaan Arfeen
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
26
274
0
10 Nov 2019
Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei
Tengyu Ma
AAML
OOD
30
85
0
09 Oct 2019
Towards Understanding the Transferability of Deep Representations
Hong Liu
Mingsheng Long
Jianmin Wang
Michael I. Jordan
21
25
0
26 Sep 2019
Understanding and Robustifying Differentiable Architecture Search
Arber Zela
T. Elsken
Tonmoy Saikia
Yassine Marrakchi
Thomas Brox
Frank Hutter
OOD
AAML
66
366
0
20 Sep 2019
How Does Learning Rate Decay Help Modern Neural Networks?
Kaichao You
Mingsheng Long
Jianmin Wang
Michael I. Jordan
20
4
0
05 Aug 2019
The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks
Ryo Karakida
S. Akaho
S. Amari
19
39
0
07 Jun 2019
No Peek: A Survey of private distributed deep learning
Praneeth Vepakomma
Tristan Swedish
Ramesh Raskar
O. Gupta
Abhimanyu Dubey
SyDa
FedML
22
99
0
08 Dec 2018
Parameter Re-Initialization through Cyclical Batch Size Schedules
Norman Mu
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
ODL
22
8
0
04 Dec 2018
Logit Pairing Methods Can Fool Gradient-Based Attacks
Marius Mosbach
Maksym Andriushchenko
T. A. Trost
Matthias Hein
Dietrich Klakow
AAML
19
82
0
29 Oct 2018
Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Charles H. Martin
Michael W. Mahoney
AI4CE
30
190
0
02 Oct 2018
Don't Use Large Mini-Batches, Use Local SGD
Tao R. Lin
Sebastian U. Stich
Kumar Kshitij Patel
Martin Jaggi
34
429
0
22 Aug 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
281
2,888
0
15 Sep 2016
Adversarial examples in the physical world
Alexey Kurakin
Ian Goodfellow
Samy Bengio
SILM
AAML
281
5,835
0
08 Jul 2016