1506.02617
Cited By
Path-SGD: Path-Normalized Optimization in Deep Neural Networks
Neural Information Processing Systems (NeurIPS), 2015
8 June 2015
Behnam Neyshabur
Ruslan Salakhutdinov
Nathan Srebro
ODL
Papers citing
"Path-SGD: Path-Normalized Optimization in Deep Neural Networks"
50 / 195 papers shown
Isotropic Curvature Model for Understanding Deep Learning Optimization: Is Gradient Orthogonalization Optimal?
Weijie Su
145
1
0
01 Nov 2025
Closed-form $\ell_r$ norm scaling with data for overparameterized linear regression and diagonal linear networks under $\ell_p$ bias
Shuofeng Zhang
A. Louis
212
0
0
25 Sep 2025
Efficiently Seeking Flat Minima for Better Generalization in Fine-Tuning Large Language Models and Beyond
Jiaxin Deng
Qingcheng Zhu
Junbiao Pang
Linlin Yang
Zhongqian Fu
Baochang Zhang
150
0
0
01 Aug 2025
Symmetry in Neural Network Parameter Spaces
Bo Zhao
Robin Walters
Rose Yu
370
8
0
16 Jun 2025
Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings
Evan Markou
Thalaiyasingam Ajanthan
Stephen Gould
314
0
0
10 Jun 2025
Improving Learning to Optimize Using Parameter Symmetries
Guy Zamir
Aryan Dokania
B. Zhao
Rose Yu
317
2
0
21 Apr 2025
The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE
Andrei Chernov
Oleg Novitskij
382
1
0
24 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Wenlin Yao
639
6
0
01 Feb 2025
An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models
Neural Information Processing Systems (NeurIPS), 2024
Yunzhe Hu
Difan Zou
Dong Xu
383
3
0
26 Nov 2024
Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
Neural Information Processing Systems (NeurIPS), 2024
Bingcong Li
Liang Zhang
Niao He
284
9
0
18 Oct 2024
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
Edan Kinderman
Itay Hubara
Haggai Maron
Daniel Soudry
MoMe
364
3
0
02 Oct 2024
Monomial Matrix Group Equivariant Neural Functional Networks
Neural Information Processing Systems (NeurIPS), 2024
Hoang V. Tran
Thieu N. Vo
Tho H. Tran
An T. Nguyen
Tan M. Nguyen
474
13
0
18 Sep 2024
Application of Langevin Dynamics to Advance the Quantum Natural Gradient Optimization Algorithm
Oleksandr Borysenko
Mykhailo Bratchenko
Ilya Lukin
Mykola Luhanko
Ihor Omelchenko
Andrii Sotnikov
Alessandro Lomi
389
0
0
03 Sep 2024
Quantum-secure multiparty deep learning
Kfir Sulimany
S. Vadlamani
R. Hamerly
Prahlad Iyengar
Dirk Englund
261
11
0
10 Aug 2024
Do Sharpness-based Optimizers Improve Generalization in Medical Image Analysis?
IEEE Access, 2024
Mohamed Hassan
Aleksandar Vakanski
Min Xian
AAML
MedIm
387
3
0
07 Aug 2024
Scale Equivariant Graph Metanetworks
Ioannis Kalogeropoulos
Giorgos Bouritsas
Yannis Panagakis
392
15
0
15 Jun 2024
ReLUs Are Sufficient for Learning Implicit Neural Representations
Joseph Shenouda
Yamin Zhou
Robert D. Nowak
246
7
0
04 Jun 2024
Sparser, Better, Deeper, Stronger: Improving Sparse Training with Exact Orthogonal Initialization
A. Nowak
Lukasz Gniecki
Filip Szatkowski
Jacek Tabor
301
2
0
03 Jun 2024
The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof
Derek Lim
Moe Putterman
Robin Walters
Haggai Maron
Stefanie Jegelka
471
15
0
30 May 2024
Scalable Optimization in the Modular Norm
Neural Information Processing Systems (NeurIPS), 2024
Tim Large
Yang Liu
Minyoung Huh
Hyojin Bahng
Phillip Isola
Jeremy Bernstein
244
30
0
23 May 2024
Hidden Synergy: $L_1$ Weight Normalization and 1-Path-Norm Regularization
Aditya Biswas
262
1
0
29 Apr 2024
On the Benefits of Over-parameterization for Out-of-Distribution Generalization
Yifan Hao
Yong Lin
Difan Zou
Tong Zhang
OODD
OOD
245
6
0
26 Mar 2024
Boosting Adversarial Training via Fisher-Rao Norm-based Regularization
Xiangyu Yin
Wenjie Ruan
AAML
177
11
0
26 Mar 2024
Understanding the Double Descent Phenomenon in Deep Learning
Marc Lafon
Alexandre Thomas
350
4
0
15 Mar 2024
Level Set Teleportation: An Optimization Perspective
Aaron Mishkin
A. Bietti
Robert Mansel Gower
308
1
0
05 Mar 2024
Fine-tuning with Very Large Dropout
Jianyu Zhang
Léon Bottou
389
9
0
01 Mar 2024
Leveraging PAC-Bayes Theory and Gibbs Distributions for Generalization Bounds with Complexity Measures
Paul Viallard
Rémi Emonet
Amaury Habrard
Emilie Morvant
Valentina Zantedeschi
320
4
0
19 Feb 2024
Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate
Can Jin
Tong Che
Hongwu Peng
Yiyuan Li
Dimitris N. Metaxas
Marco Pavone
335
59
0
05 Feb 2024
Unification of Symmetries Inside Neural Networks: Transformer, Feedforward and Neural ODE
Koji Hashimoto
Yuji Hirono
Akiyoshi Sannai
AI4CE
256
12
0
04 Feb 2024
The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness
Yifan Hao
Tong Zhang
AAML
507
5
0
19 Jan 2024
Applying statistical learning theory to deep learning
Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.), 2023
Cédric Gerbelot
Avetik G. Karagulyan
Stefani Karp
Kavya Ravichandran
Menachem Stern
Nathan Srebro
FedML
248
3
0
26 Nov 2023
Optimization dependent generalization bound for ReLU networks based on sensitivity in the tangent bundle
Dániel Rácz
Mihaly Petreczky
András Csertán
Bálint Daróczy
MLT
231
1
0
26 Oct 2023
A Symmetry-Aware Exploration of Bayesian Neural Network Posteriors
International Conference on Learning Representations (ICLR), 2023
Olivier Laurent
Emanuel Aldea
Gianni Franchi
BDL
UQCV
289
10
0
12 Oct 2023
Deep Neural Networks Tend To Extrapolate Predictably
International Conference on Learning Representations (ICLR), 2023
Katie Kang
Amrith Rajagopal Setlur
Claire Tomlin
Sergey Levine
213
0
0
02 Oct 2023
Fantastic Generalization Measures are Nowhere to be Found
International Conference on Learning Representations (ICLR), 2023
Michael C. Gastpar
Ido Nachum
Jonathan Shafer
T. Weinberger
368
24
0
24 Sep 2023
Weighted variation spaces and approximation by shallow ReLU networks
Applied and Computational Harmonic Analysis (ACHA), 2023
Ronald A. DeVore
Robert D. Nowak
Rahul Parhi
Jonathan W. Siegel
247
6
0
28 Jul 2023
Quantum Machine Learning on Near-Term Quantum Devices: Current State of Supervised and Unsupervised Techniques for Real-World Applications
Physical Review Applied (Phys. Rev. Appl.), 2023
Yaswitha Gujju
A. Matsuo
Raymond H. Putra
451
48
0
03 Jul 2023
Nonparametric regression using over-parameterized shallow ReLU neural networks
Journal of Machine Learning Research (JMLR), 2023
Yunfei Yang
Ding-Xuan Zhou
347
15
0
14 Jun 2023
Hidden symmetries of ReLU networks
International Conference on Machine Learning (ICML), 2023
J. E. Grigsby
Kathryn A. Lindsey
David Rolnick
266
26
0
09 Jun 2023
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
International Conference on Machine Learning (ICML), 2023
Atli Kosson
Bettina Messmer
Martin Jaggi
456
30
0
26 May 2023
Improving Convergence and Generalization Using Parameter Symmetries
International Conference on Learning Representations (ICLR), 2023
Bo Zhao
Robert Mansel Gower
Robin Walters
Rose Yu
MoMe
393
22
0
22 May 2023
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
International Conference on Machine Learning (ICML), 2023
Guohao Shen
372
6
0
19 May 2023
Convergence of stochastic gradient descent under a local Lojasiewicz condition for deep neural networks
Jing An
Jianfeng Lu
197
6
0
18 Apr 2023
Solving Regularized Exp, Cosh and Sinh Regression Problems
Zhihang Li
Zhao Song
Wanrong Zhu
199
41
0
28 Mar 2023
Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation
USENIX Security Symposium (USENIX Security), 2023
Yifan Yan
Xudong Pan
Mi Zhang
Min Yang
AAML
249
27
0
17 Mar 2023
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Neural Information Processing Systems (NeurIPS), 2023
Agustinus Kristiadi
Felix Dangel
Philipp Hennig
236
17
0
14 Feb 2023
Equivariant Architectures for Learning in Deep Weight Spaces
International Conference on Machine Learning (ICML), 2023
Aviv Navon
Aviv Shamsian
Idan Achituve
Ethan Fetaya
Gal Chechik
Haggai Maron
352
86
0
30 Jan 2023
Quantifying the Impact of Label Noise on Federated Learning
Shuqi Ke
Chao Huang
Xin Liu
FedML
343
8
0
15 Nov 2022
Instance-Dependent Generalization Bounds via Optimal Transport
Journal of Machine Learning Research (JMLR), 2022
Songyan Hou
Parnian Kassraie
Anastasis Kratsios
Andreas Krause
Jonas Rothfuss
500
12
0
02 Nov 2022
Symmetries, flat minima, and the conserved quantities of gradient flow
International Conference on Learning Representations (ICLR), 2022
Bo Zhao
I. Ganev
Robin Walters
Rose Yu
Nima Dehmamy
366
28
0
31 Oct 2022