PyHessian: Neural Networks Through the Lens of the Hessian


16 December 2019
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
    ODL

Papers citing "PyHessian: Neural Networks Through the Lens of the Hessian"

50 / 66 papers shown
Title
Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training
Qitao Tan
Sung-En Chang
Rui Xia
Huidong Ji
Chence Yang
...
Zheng Zhan
Zhou Zou
Y. Wang
Jin Lu
Geng Yuan
41
0
0
28 Apr 2025
The effect of the number of parameters and the number of local feature patches on loss landscapes in distributed quantum neural networks
Yoshiaki Kawase
71
0
0
27 Apr 2025
Modes of Sequence Models and Learning Coefficients
Zhongtian Chen
Daniel Murfet
82
1
0
25 Apr 2025
Layer-wise Adaptive Gradient Norm Penalizing Method for Efficient and Accurate Deep Learning
Sunwoo Lee
98
0
0
18 Mar 2025
Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization
Dongkwan Lee
Kyomin Hwang
Nojun Kwak
94
0
0
18 Mar 2025
Position: Curvature Matrices Should Be Democratized via Linear Operators
Felix Dangel
Runa Eschenhagen
Weronika Ormaniec
Andres Fernandez
Lukas Tatzel
Agustinus Kristiadi
53
3
0
31 Jan 2025
A Hessian-informed hyperparameter optimization for differential learning rate
Shiyun Xu
Zhiqi Bu
Yiliang Zhang
Ian J. Barnett
39
1
0
12 Jan 2025
Meta Curvature-Aware Minimization for Domain Generalization
Z. Chen
Yiwen Ye
Feilong Tang
Yongsheng Pan
Yong-quan Xia
BDL
173
1
0
16 Dec 2024
Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis
Zhijie Chen
Qiaobo Li
A. Banerjee
FedML
30
0
0
11 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurélien Lucchi
AI4CE
39
0
0
04 Nov 2024
Scaling laws for post-training quantized large language models
Zifei Xu
Alexander Lan
W. Yazar
T. Webb
Sayeh Sharify
Xin Eric Wang
MQ
26
0
0
15 Oct 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
33
6
0
14 Oct 2024
Fast Training of Sinusoidal Neural Fields via Scaling Initialization
Taesun Yeom
Sangyoon Lee
Jaeho Lee
53
2
0
07 Oct 2024
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov
Alexandru Meterez
James B. Simon
C. Pehlevan
43
2
0
06 Oct 2024
Just How Flexible are Neural Networks in Practice?
Ravid Shwartz-Ziv
Micah Goldblum
Arpit Bansal
C. B. Bruss
Yann LeCun
Andrew Gordon Wilson
35
4
0
17 Jun 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
44
0
0
11 Jun 2024
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li
Bicheng Ying
Zidong Liu
Haibo Yang
FedML
59
3
0
24 May 2024
Leveraging the Human Ventral Visual Stream to Improve Neural Network Robustness
Zhenan Shao
Linjian Ma
Bo Li
Diane M. Beck
AAML
33
3
0
04 May 2024
Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization
S. Reifenstein
T. Leleu
Yoshihisa Yamamoto
35
1
0
02 May 2024
Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Pingzhi Li
Junyu Liu
Hanrui Wang
Tianlong Chen
81
1
0
30 Apr 2024
Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications
Lucas Böttcher
Gregory R. Wheeler
32
0
0
05 Apr 2024
The Expected Loss of Preconditioned Langevin Dynamics Reveals the Hessian Rank
Amitay Bar
Rotem Mulayoff
T. Michaeli
Ronen Talmon
54
0
0
21 Feb 2024
FedSoup: Improving Generalization and Personalization in Federated Learning via Selective Model Interpolation
Minghui Chen
Meirui Jiang
Qianming Dou
Zehua Wang
Xiaoxiao Li
FedML
30
15
0
20 Jul 2023
The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
16
7
0
15 Jul 2023
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu
Zhiyuan Li
David Leo Wright Hall
Percy Liang
Tengyu Ma
VLM
21
128
0
23 May 2023
GeNAS: Neural Architecture Search with Better Generalization
Joonhyun Jeong
Joonsang Yu
Geondo Park
Dongyoon Han
Y. Yoo
20
4
0
15 May 2023
Hard Sample Matters a Lot in Zero-Shot Quantization
Huantong Li
Xiangmiao Wu
Fanbing Lv
Daihai Liao
Thomas H. Li
Yonggang Zhang
Bo Han
Mingkui Tan
MQ
24
20
0
24 Mar 2023
Fast as CHITA: Neural Network Pruning with Combinatorial Optimization
Riade Benbaki
Wenyu Chen
X. Meng
Hussein Hazimeh
Natalia Ponomareva
Zhe Zhao
Rahul Mazumder
13
26
0
28 Feb 2023
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
Deepika Bablani
J. McKinstry
S. K. Esser
R. Appuswamy
D. Modha
MQ
10
4
0
30 Jan 2023
Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data
Harsh Rangwani
Sumukh K Aithal
Mayank Mishra
R. Venkatesh Babu
23
27
0
28 Dec 2022
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
Ziqiao Wang
Yongyi Mao
10
10
0
19 Nov 2022
Noise Injection as a Probe of Deep Learning Dynamics
Noam Levi
I. Bloch
M. Freytsis
T. Volansky
32
2
0
24 Oct 2022
Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural Networks on Coupled Ordinary Differential Equations
Alexander New
B. Eng
A. Timm
A. Gearhart
12
4
0
14 Oct 2022
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
Xingyu Zhu
Zixuan Wang
Xiang Wang
Mo Zhou
Rong Ge
64
35
0
07 Oct 2022
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
Mansheej Paul
F. Chen
Brett W. Larsen
Jonathan Frankle
Surya Ganguli
Gintare Karolina Dziugaite
UQCV
25
38
0
06 Oct 2022
Homotopy-based training of NeuralODEs for accurate dynamics discovery
Joon-Hyuk Ko
Hankyul Koh
Nojun Park
W. Jhe
35
8
0
04 Oct 2022
Optimal Query Complexities for Dynamic Trace Estimation
David P. Woodruff
Fred Zhang
Qiuyi Zhang
29
4
0
30 Sep 2022
LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity
Martin Gubri
Maxime Cordy
Mike Papadakis
Yves Le Traon
Koushik Sen
AAML
22
51
0
26 Jul 2022
Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning
John Nguyen
Jianyu Wang
Kshitiz Malik
Maziar Sanjabi
Michael G. Rabbat
FedML
AI4CE
21
21
0
30 Jun 2022
A Closer Look at Smoothness in Domain Adversarial Training
Harsh Rangwani
Sumukh K Aithal
Mayank Mishra
Arihant Jain
R. Venkatesh Babu
25
119
0
16 Jun 2022
Neurotoxin: Durable Backdoors in Federated Learning
Zhengming Zhang
Ashwinee Panda
Linyue Song
Yaoqing Yang
Michael W. Mahoney
Joseph E. Gonzalez
Kannan Ramchandran
Prateek Mittal
FedML
8
129
0
12 Jun 2022
Explaining the physics of transfer learning a data-driven subgrid-scale closure to a different turbulent flow
Adam Subel
Yifei Guan
A. Chattopadhyay
P. Hassanzadeh
AI4CE
27
41
0
07 Jun 2022
Lagrangian PINNs: A causality-conforming solution to failure modes of physics-informed neural networks
R. Mojgani
Maciej Balajewicz
P. Hassanzadeh
PINN
23
45
0
05 May 2022
A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima
Tae-Eon Ko
Xiantao Li
20
2
0
21 Mar 2022
Neuromorphic Data Augmentation for Training Spiking Neural Networks
Yuhang Li
Youngeun Kim
Hyoungseob Park
Tamar Geller
Priyadarshini Panda
31
75
0
11 Mar 2022
The rise of the lottery heroes: why zero-shot pruning is hard
Enzo Tartaglione
21
6
0
24 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
25
465
0
14 Feb 2022
Anticorrelated Noise Injection for Improved Generalization
Antonio Orvieto
Hans Kersting
F. Proske
Francis R. Bach
Aurélien Lucchi
53
44
0
06 Feb 2022
When Do Flat Minima Optimizers Work?
Jean Kaddour
Linqing Liu
Ricardo M. A. Silva
Matt J. Kusner
ODL
11
58
0
01 Feb 2022
On the Power-Law Hessian Spectrums in Deep Learning
Zeke Xie
Qian-Yuan Tang
Yunfeng Cai
Mingming Sun
P. Li
ODL
42
8
0
31 Jan 2022