Path-SGD: Path-Normalized Optimization in Deep Neural Networks

Neural Information Processing Systems (NeurIPS), 2015

8 June 2015

Papers citing "Path-SGD: Path-Normalized Optimization in Deep Neural Networks"

Vision Transformers provably learn spatial structure

Neural Information Processing Systems (NeurIPS), 2022

225

102

13 Oct 2022

PathProx: A Proximal Gradient Algorithm for Weight Decay Regularized Deep Neural Networks

Liu Yang

Jifan Zhang

Joseph Shenouda

Dimitris Papailiopoulos

Kangwook Lee

Robert D. Nowak

354

06 Oct 2022

Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel

International Conference on Learning Representations (ICLR), 2022

183

30 Sep 2022

Small Transformers Compute Universal Metric Embeddings

Journal of Machine Learning Research (JMLR), 2022

Anastasis Kratsios

Valentin Debarnot

Ivan Dokmanić

317

14 Sep 2022

Quiver neural networks

I. Ganev

Robin Walters

147

26 Jul 2022

Towards understanding how momentum improves generalization in deep learning

International Conference on Machine Learning (ICML), 2022

Samy Jelassi

Yuanzhi Li

201

13 Jul 2022

Predicting Out-of-Domain Generalization with Neighborhood Invariance

371

05 Jul 2022

Local Identifiability of Deep ReLU Neural Networks: the Theory

Neural Information Processing Systems (NeurIPS), 2022

Joachim Bona-Pellissier

François Malgouyres

François Bachoc

296

15 Jun 2022

Symmetry Teleportation for Accelerated Optimization

Neural Information Processing Systems (NeurIPS), 2022

443

21 May 2022

Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey

Artificial Intelligence Review (Artif Intell Rev), 2022

Paul Wimmer

Jens Mehnert

Alexandru Paul Condurache

336

17 May 2022

Cracking White-box DNN Watermarks via Invariant Neuron Transforms

Knowledge Discovery and Data Mining (KDD), 2022

148

30 Apr 2022

Higher-Order Generalization Bounds: Learning Deep Probabilistic Programs via PAC-Bayes Objectives

J. Warrell

M. Gerstein

183

30 Mar 2022

A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima

Tae-Eon Ko

Xiantao Li

232

21 Mar 2022

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning

International Conference on Machine Learning (ICML), 2022

Yang Zhao

Hao Zhang

Xiuyuan Hu

519

153

08 Feb 2022

Training invariances and the low-rank phenomenon: beyond linear networks

International Conference on Learning Representations (ICLR), 2022

Thien Le

Stefanie Jegelka

254

28 Jan 2022

Approximation bounds for norm constrained neural networks with applications to regression and GANs

Applied and Computational Harmonic Analysis (ACHA), 2022

Yuling Jiao

Yang Wang

Yunfei Yang

232

24 Jan 2022

Weight Expansion: A New Perspective on Dropout and Generalization

283

23 Jan 2022

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape

International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

Devansh Bisla

Jing Wang

A. Choromańska

331

20 Jan 2022

Measuring Complexity of Learning Schemes Using Hessian-Schatten Total Variation

Shayan Aziznejad

Joaquim Campos

M. Unser

249

12 Dec 2021

GBK-GNN: Gated Bi-Kernel Graph Neural Networks for Modeling Both Homophily and Heterophily

The Web Conference (WWW), 2021

496

143

29 Oct 2021

In Search of Probeable Generalization Measures

International Conference on Machine Learning and Applications (ICMLA), 2021

Jonathan Jaegerman

Khalil Damouni

M. M. Ankaralı

Konstantinos N. Plataniotis

158

23 Oct 2021

Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks

Tolga Ergen

Mert Pilanci

441

18 Oct 2021

The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks

International Conference on Learning Representations (ICLR), 2021

605

273

12 Oct 2021

Perturbated Gradients Updating within Unit Space for Deep Learning

378

01 Oct 2021

Understanding neural networks with reproducing kernel Banach spaces

278

20 Sep 2021

Near-Minimax Optimal Estimation With Shallow ReLU Neural Networks

Rahul Parhi

Robert D. Nowak

356

18 Sep 2021

Batch Normalization Preconditioning for Neural Network Training

Susanna Lange

Kyle E. Helfrich

Qiang Ye

210

02 Aug 2021

An Embedding of ReLU Networks and an Analysis of their Identifiability

Constructive Approximation (Constr. Approx.), 2021

Pierre Stock

Rémi Gribonval

283

20 Jul 2021

Universal approximation and model compression for radial neural networks

I. Ganev

Twan van Laarhoven

Robin Walters

303

06 Jul 2021

Practical Assessment of Generalization Performance Robustness for Deep Networks via Contrastive Examples

Haoyi Xiong

Siyu Huang

116

20 Jun 2021

Solving hybrid machine learning tasks by traversing weight space geodesics

G. Raghavan

Matt Thomson

05 Jun 2021

What Kinds of Functions do Deep Neural Networks Learn? Insights from Variational Spline Theory

SIAM Journal on Mathematics of Data Science (SIMODS), 2021

Rahul Parhi

Robert D. Nowak

412

07 May 2021

Noether's Learning Dynamics: Role of Symmetry Breaking in Neural Networks

Neural Information Processing Systems (NeurIPS), 2021

Hidenori Tanaka

D. Kunin

359

06 May 2021

SGD Implicitly Regularizes Generalization Error

Daniel A. Roberts

127

10 Apr 2021

Quantitative Performance Assessment of CNN Units via Topological Entropy Calculation

International Conference on Learning Representations (ICLR), 2021

Yang Zhao

Hao Zhang

253

17 Mar 2021

Weight Rescaling: Effective and Robust Regularization for Deep Neural Networks with Batch Normalization

Yufei Cui

249

06 Feb 2021

Accelerating Training of Batch Normalization: A Manifold Perspective

Conference on Uncertainty in Artificial Intelligence (UAI), 2021

Mingyang Yi

285

08 Jan 2021

The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks

International Conference on Machine Learning (ICML), 2020

281

11 Dec 2020

Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics

D. Kunin

Javier Sagastuy-Breña

Surya Ganguli

Daniel L. K. Yamins

Hidenori Tanaka

346

08 Dec 2020

Characterization of Excess Risk for Locally Strongly Convex Population Risk

Neural Information Processing Systems (NeurIPS), 2020

Mingyang Yi

Ruoyu Wang

Zhi-Ming Ma

340

04 Dec 2020

Gabriel Gibeau Sanchez

Philippe Spino

Pierre-Marc Jodoin

338

02 Dec 2020

Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian

Neural Information Processing Systems (NeurIPS), 2020

188

12 Nov 2020

An Information-Geometric Distance on the Space of Tasks

International Conference on Machine Learning (ICML), 2020

Yansong Gao

Pratik Chaudhari

204

01 Nov 2020

The power of quantum neural networks

Nature Computational Science (NCS), 2020

528

948

30 Oct 2020

Effective Regularization Through Loss-Function Metalearning

IEEE Congress on Evolutionary Computation (CEC), 2020

Santiago Gonzalez

Xin Qiu

Risto Miikkulainen

522

02 Oct 2020

Learning Optimal Representations with the Decodable Information Bottleneck

Neural Information Processing Systems (NeurIPS), 2020

Yann Dubois

Douwe Kiela

D. Schwab

Ramakrishna Vedantam

247

27 Sep 2020

MSR-DARTS: Minimum Stable Rank of Differentiable Architecture Search

150

19 Sep 2020

Extreme Memorization via Scale of Initialization

International Conference on Learning Representations (ICLR), 2020

Harsh Mehta

Ashok Cutkosky

Behnam Neyshabur

165

31 Aug 2020

Shallow Univariate ReLu Networks as Splines: Initialization, Loss Surface, Hessian, & Gradient Flow Dynamics

208

04 Aug 2020

The Representation Theory of Neural Networks

M. Armenta

Pierre-Marc Jodoin

338

23 Jul 2020