Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1912.07145
Cited By
PyHessian: Neural Networks Through the Lens of the Hessian
16 December 2019
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
ODL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"PyHessian: Neural Networks Through the Lens of the Hessian"
50 / 66 papers shown
Title
Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training
Qitao Tan
Sung-En Chang
Rui Xia
Huidong Ji
Chence Yang
...
Zheng Zhan
Zhou Zou
Y. Wang
Jin Lu
Geng Yuan
41
0
0
28 Apr 2025
The effect of the number of parameters and the number of local feature patches on loss landscapes in distributed quantum neural networks
Yoshiaki Kawase
71
0
0
27 Apr 2025
Modes of Sequence Models and Learning Coefficients
Zhongtian Chen
Daniel Murfet
82
1
0
25 Apr 2025
Layer-wise Adaptive Gradient Norm Penalizing Method for Efficient and Accurate Deep Learning
Sunwoo Lee
98
0
0
18 Mar 2025
Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization
Dongkwan Lee
Kyomin Hwang
Nojun Kwak
94
0
0
18 Mar 2025
Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel
Runa Eschenhagen
Weronika Ormaniec
Andres Fernandez
Lukas Tatzel
Agustinus Kristiadi
53
3
0
31 Jan 2025
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu
Zhiqi Bu
Yiliang Zhang
Ian J. Barnett
39
1
0
12 Jan 2025
Meta Curvature-Aware Minimization for Domain Generalization
Z. Chen
Yiwen Ye
Feilong Tang
Yongsheng Pan
Yong-quan Xia
BDL
173
1
0
16 Dec 2024
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
Zhijie Chen
Qiaobo Li
A. Banerjee
FedML
30
0
0
11 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurélien Lucchi
AI4CE
39
0
0
04 Nov 2024
Scaling laws for post-training quantized large language models
Zifei Xu
Alexander Lan
W. Yazar
T. Webb
Sayeh Sharify
Xin Eric Wang
MQ
26
0
0
15 Oct 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
33
6
0
14 Oct 2024
Fast Training of Sinusoidal Neural Fields via Scaling Initialization
Taesun Yeom
Sangyoon Lee
Jaeho Lee
53
2
0
07 Oct 2024
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov
Alexandru Meterez
James B. Simon
C. Pehlevan
43
2
0
06 Oct 2024
Just How Flexible are Neural Networks in Practice?
Ravid Shwartz-Ziv
Micah Goldblum
Arpit Bansal
C. B. Bruss
Yann LeCun
Andrew Gordon Wilson
35
4
0
17 Jun 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
44
0
0
11 Jun 2024
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li
Bicheng Ying
Zidong Liu
Haibo Yang
Haibo Yang
FedML
59
3
0
24 May 2024
Leveraging the Human Ventral Visual Stream to Improve Neural Network Robustness
Zhenan Shao
Linjian Ma
Bo Li
Diane M. Beck
AAML
33
3
0
04 May 2024
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
S. Reifenstein
T. Leleu
Yoshihisa Yamamoto
35
1
0
02 May 2024
Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Pingzhi Li
Junyu Liu
Hanrui Wang
Tianlong Chen
81
1
0
30 Apr 2024
Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications
Lucas Böttcher
Gregory R. Wheeler
32
0
0
05 Apr 2024
The Expected Loss of Preconditioned Langevin Dynamics Reveals the Hessian Rank
Amitay Bar
Rotem Mulayoff
T. Michaeli
Ronen Talmon
54
0
0
21 Feb 2024
FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation
Minghui Chen
Meirui Jiang
Qianming Dou
Zehua Wang
Xiaoxiao Li
FedML
30
15
0
20 Jul 2023
The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
16
7
0
15 Jul 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
21
128
0
23 May 2023
GeNAS: Neural Architecture Search with Better Generalization
Joonhyun Jeong
Joonsang Yu
Geondo Park
Dongyoon Han
Y. Yoo
20
4
0
15 May 2023
Hard Sample Matters a Lot in Zero-Shot Quantization
Huantong Li
Xiangmiao Wu
Fanbing Lv
Daihai Liao
Thomas H. Li
Yonggang Zhang
Bo Han
Mingkui Tan
MQ
24
20
0
24 Mar 2023
Fast as CHITA: Neural Network Pruning with Combinatorial Optimization
Riade Benbaki
Wenyu Chen
X. Meng
Hussein Hazimeh
Natalia Ponomareva
Zhe Zhao
Rahul Mazumder
13
26
0
28 Feb 2023
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
Deepika Bablani
J. McKinstry
S. K. Esser
R. Appuswamy
D. Modha
MQ
10
4
0
30 Jan 2023
Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data
Harsh Rangwani
Sumukh K Aithal
Mayank Mishra
R. Venkatesh Babu
23
27
0
28 Dec 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang
Yongyi Mao
10
10
0
19 Nov 2022
Noise Injection as a Probe of Deep Learning Dynamics
Noam Levi
I. Bloch
M. Freytsis
T. Volansky
32
2
0
24 Oct 2022
Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural Networks on Coupled Ordinary Differential Equations
Alexander New
B. Eng
A. Timm
A. Gearhart
12
4
0
14 Oct 2022
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
Xingyu Zhu
Zixuan Wang
Xiang Wang
Mo Zhou
Rong Ge
64
35
0
07 Oct 2022
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
Mansheej Paul
F. Chen
Brett W. Larsen
Jonathan Frankle
Surya Ganguli
Gintare Karolina Dziugaite
UQCV
25
38
0
06 Oct 2022
Homotopy-based training of NeuralODEs for accurate dynamics discovery
Joon-Hyuk Ko
Hankyul Koh
Nojun Park
W. Jhe
35
8
0
04 Oct 2022
Optimal Query Complexities for Dynamic Trace Estimation
David P. Woodruff
Fred Zhang
Qiuyi Zhang
29
4
0
30 Sep 2022
LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity
Martin Gubri
Maxime Cordy
Mike Papadakis
Yves Le Traon
Koushik Sen
AAML
22
51
0
26 Jul 2022
Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning
John Nguyen
Jianyu Wang
Kshitiz Malik
Maziar Sanjabi
Michael G. Rabbat
FedML
AI4CE
21
21
0
30 Jun 2022
A Closer Look at Smoothness in Domain Adversarial Training
Harsh Rangwani
Sumukh K Aithal
Mayank Mishra
Arihant Jain
R. Venkatesh Babu
25
119
0
16 Jun 2022
Neurotoxin: Durable Backdoors in Federated Learning
Zhengming Zhang
Ashwinee Panda
Linyue Song
Yaoqing Yang
Michael W. Mahoney
Joseph E. Gonzalez
Kannan Ramchandran
Prateek Mittal
FedML
8
129
0
12 Jun 2022
Explaining the physics of transfer learning a data-driven subgrid-scale closure to a different turbulent flow
Adam Subel
Yifei Guan
A. Chattopadhyay
P. Hassanzadeh
AI4CE
27
41
0
07 Jun 2022
Lagrangian PINNs: A causality-conforming solution to failure modes of physics-informed neural networks
R. Mojgani
Maciej Balajewicz
P. Hassanzadeh
PINN
23
45
0
05 May 2022
A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
Tae-Eon Ko
Xiantao Li
20
2
0
21 Mar 2022
Neuromorphic Data Augmentation for Training Spiking Neural Networks
Yuhang Li
Youngeun Kim
Hyoungseob Park
Tamar Geller
Priyadarshini Panda
31
75
0
11 Mar 2022
The rise of the lottery heroes: why zero-shot pruning is hard
Enzo Tartaglione
21
6
0
24 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
25
465
0
14 Feb 2022
Anticorrelated Noise Injection for Improved Generalization
Antonio Orvieto
Hans Kersting
F. Proske
Francis R. Bach
Aurélien Lucchi
53
44
0
06 Feb 2022
When Do Flat Minima Optimizers Work?
Jean Kaddour
Linqing Liu
Ricardo M. A. Silva
Matt J. Kusner
ODL
11
58
0
01 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
42
8
0
31 Jan 2022
1
2
Next