Path-SGD: Path-Normalized Optimization in Deep Neural Networks

Neural Information Processing Systems (NeurIPS), 2015
8 June 2015
Behnam Neyshabur
Ruslan Salakhutdinov
Nathan Srebro
    ODL

Papers citing "Path-SGD: Path-Normalized Optimization in Deep Neural Networks"

50 / 195 papers shown
Isotropic Curvature Model for Understanding Deep Learning Optimization: Is Gradient Orthogonalization Optimal?
Weijie Su
145
1
0
01 Nov 2025
Closed-form $\ell_r$ norm scaling with data for overparameterized linear regression and diagonal linear networks under $\ell_p$ bias
Shuofeng Zhang
A. Louis
212
0
0
25 Sep 2025
Efficiently Seeking Flat Minima for Better Generalization in Fine-Tuning Large Language Models and Beyond
Jiaxin Deng
Qingcheng Zhu
Junbiao Pang
Linlin Yang
Zhongqian Fu
Baochang Zhang
150
0
0
01 Aug 2025
Symmetry in Neural Network Parameter Spaces
Bo Zhao
Robin Walters
Rose Yu
370
8
0
16 Jun 2025
Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings
Evan Markou
Thalaiyasingam Ajanthan
Stephen Gould
314
0
0
10 Jun 2025
Improving Learning to Optimize Using Parameter Symmetries
Guy Zamir
Aryan Dokania
B. Zhao
Rose Yu
317
2
0
21 Apr 2025
The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE
Andrei Chernov
Oleg Novitskij
382
1
0
24 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Wenlin Yao
639
6
0
01 Feb 2025
An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models
Neural Information Processing Systems (NeurIPS), 2024
Yunzhe Hu
Difan Zou
Dong Xu
383
3
0
26 Nov 2024
Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
Neural Information Processing Systems (NeurIPS), 2024
Bingcong Li
Liang Zhang
Niao He
284
9
0
18 Oct 2024
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
Edan Kinderman
Itay Hubara
Haggai Maron
Daniel Soudry
MoMe
364
3
0
02 Oct 2024
Monomial Matrix Group Equivariant Neural Functional Networks
Neural Information Processing Systems (NeurIPS), 2024
Hoang V. Tran
Thieu N. Vo
Tho H. Tran
An T. Nguyen
Tan M. Nguyen
474
13
0
18 Sep 2024
Application of Langevin Dynamics to Advance the Quantum Natural Gradient Optimization Algorithm
Oleksandr Borysenko
Mykhailo Bratchenko
Ilya Lukin
Mykola Luhanko
Ihor Omelchenko
Andrii Sotnikov
Alessandro Lomi
389
0
0
03 Sep 2024
Quantum-secure multiparty deep learning
Kfir Sulimany
S. Vadlamani
R. Hamerly
Prahlad Iyengar
Dirk Englund
261
11
0
10 Aug 2024
Do Sharpness-based Optimizers Improve Generalization in Medical Image Analysis?
IEEE Access (IEEE Access), 2024
Mohamed Hassan
Aleksandar Vakanski
Min Xian
AAML MedIm
387
3
0
07 Aug 2024
Scale Equivariant Graph Metanetworks
Ioannis Kalogeropoulos
Giorgos Bouritsas
Yannis Panagakis
392
15
0
15 Jun 2024
ReLUs Are Sufficient for Learning Implicit Neural Representations
Joseph Shenouda
Yamin Zhou
Robert D. Nowak
246
7
0
04 Jun 2024
Sparser, Better, Deeper, Stronger: Improving Sparse Training with Exact Orthogonal Initialization
A. Nowak
Lukasz Gniecki
Filip Szatkowski
Jacek Tabor
301
2
0
03 Jun 2024
The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof
Derek Lim
Moe Putterman
Robin Walters
Haggai Maron
Stefanie Jegelka
471
15
0
30 May 2024
Scalable Optimization in the Modular Norm
Neural Information Processing Systems (NeurIPS), 2024
Tim Large
Yang Liu
Minyoung Huh
Hyojin Bahng
Phillip Isola
Jeremy Bernstein
244
30
0
23 May 2024
Hidden Synergy: $L_1$ Weight Normalization and 1-Path-Norm Regularization
Aditya Biswas
262
1
0
29 Apr 2024
On the Benefits of Over-parameterization for Out-of-Distribution Generalization
Yifan Hao
Yong Lin
Difan Zou
Tong Zhang
OODD OOD
245
6
0
26 Mar 2024
Boosting Adversarial Training via Fisher-Rao Norm-based Regularization
Xiangyu Yin
Wenjie Ruan
AAML
177
11
0
26 Mar 2024
Understanding the Double Descent Phenomenon in Deep Learning
Marc Lafon
Alexandre Thomas
350
4
0
15 Mar 2024
Level Set Teleportation: An Optimization Perspective
Aaron Mishkin
A. Bietti
Robert Mansel Gower
308
1
0
05 Mar 2024
Fine-tuning with Very Large Dropout
Jianyu Zhang
Léon Bottou
389
9
0
01 Mar 2024
Leveraging PAC-Bayes Theory and Gibbs Distributions for Generalization Bounds with Complexity Measures
Paul Viallard
Rémi Emonet
Amaury Habrard
Emilie Morvant
Valentina Zantedeschi
320
4
0
19 Feb 2024
Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate
Can Jin
Tong Che
Hongwu Peng
Yiyuan Li
Dimitris N. Metaxas
Marco Pavone
335
59
0
05 Feb 2024
Unification of Symmetries Inside Neural Networks: Transformer, Feedforward and Neural ODE
Koji Hashimoto
Yuji Hirono
Akiyoshi Sannai
AI4CE
256
12
0
04 Feb 2024
The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness
Yifan Hao
Tong Zhang
AAML
507
5
0
19 Jan 2024
Applying statistical learning theory to deep learning
Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.), 2023
Cédric Gerbelot
Avetik G. Karagulyan
Stefani Karp
Kavya Ravichandran
Menachem Stern
Nathan Srebro
FedML
248
3
0
26 Nov 2023
Optimization dependent generalization bound for ReLU networks based on sensitivity in the tangent bundle
Dániel Rácz
Mihaly Petreczky
András Csertán
Bálint Daróczy
MLT
231
1
0
26 Oct 2023
A Symmetry-Aware Exploration of Bayesian Neural Network Posteriors
International Conference on Learning Representations (ICLR), 2023
Olivier Laurent
Emanuel Aldea
Gianni Franchi
BDL UQCV
289
10
0
12 Oct 2023
Deep Neural Networks Tend To Extrapolate Predictably
International Conference on Learning Representations (ICLR), 2023
Katie Kang
Amrith Rajagopal Setlur
Claire Tomlin
Sergey Levine
213
0
0
02 Oct 2023
Fantastic Generalization Measures are Nowhere to be Found
International Conference on Learning Representations (ICLR), 2023
Michael C. Gastpar
Ido Nachum
Jonathan Shafer
T. Weinberger
368
24
0
24 Sep 2023
Weighted variation spaces and approximation by shallow ReLU networks
Applied and Computational Harmonic Analysis (ACHA), 2023
Ronald A. DeVore
Robert D. Nowak
Rahul Parhi
Jonathan W. Siegel
247
6
0
28 Jul 2023
Quantum Machine Learning on Near-Term Quantum Devices: Current State of Supervised and Unsupervised Techniques for Real-World Applications
Physical Review Applied (Phys. Rev. Appl.), 2023
Yaswitha Gujju
A. Matsuo
Raymond H. Putra
451
48
0
03 Jul 2023
Nonparametric regression using over-parameterized shallow ReLU neural networks
Journal of Machine Learning Research (JMLR), 2023
Yunfei Yang
Ding-Xuan Zhou
347
15
0
14 Jun 2023
Hidden symmetries of ReLU networks
International Conference on Machine Learning (ICML), 2023
J. E. Grigsby
Kathryn A. Lindsey
David Rolnick
266
26
0
09 Jun 2023
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
International Conference on Machine Learning (ICML), 2023
Atli Kosson
Bettina Messmer
Martin Jaggi
456
30
0
26 May 2023
Improving Convergence and Generalization Using Parameter Symmetries
International Conference on Learning Representations (ICLR), 2023
Bo Zhao
Robert Mansel Gower
Robin Walters
Rose Yu
MoMe
393
22
0
22 May 2023
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
International Conference on Machine Learning (ICML), 2023
Guohao Shen
372
6
0
19 May 2023
Convergence of stochastic gradient descent under a local Lojasiewicz condition for deep neural networks
Jing An
Jianfeng Lu
197
6
0
18 Apr 2023
Solving Regularized Exp, Cosh and Sinh Regression Problems
Zhihang Li
Zhao Song
Wanrong Zhu
199
41
0
28 Mar 2023
Rethinking White-Box Watermarks on Deep Learning Models under Neural Structural Obfuscation
USENIX Security Symposium (USENIX Security), 2023
Yifan Yan
Xudong Pan
Mi Zhang
Min Yang
AAML
249
27
0
17 Mar 2023
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
Neural Information Processing Systems (NeurIPS), 2023
Agustinus Kristiadi
Felix Dangel
Philipp Hennig
236
17
0
14 Feb 2023
Equivariant Architectures for Learning in Deep Weight Spaces
International Conference on Machine Learning (ICML), 2023
Aviv Navon
Aviv Shamsian
Idan Achituve
Ethan Fetaya
Gal Chechik
Haggai Maron
352
86
0
30 Jan 2023
Quantifying the Impact of Label Noise on Federated Learning
Shuqi Ke
Chao Huang
Xin Liu
FedML
343
8
0
15 Nov 2022
Instance-Dependent Generalization Bounds via Optimal Transport
Journal of Machine Learning Research (JMLR), 2022
Songyan Hou
Parnian Kassraie
Anastasis Kratsios
Andreas Krause
Jonas Rothfuss
500
12
0
02 Nov 2022
Symmetries, flat minima, and the conserved quantities of gradient flow
International Conference on Learning Representations (ICLR), 2022
Bo Zhao
I. Ganev
Robin Walters
Rose Yu
Nima Dehmamy
366
28
0
31 Oct 2022