On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

24 May 2018 · Lénaïc Chizat, Francis R. Bach · OT · arXiv:1805.09545

Papers citing "On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport"

Showing 50 of 161 citing papers:
  • From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks · Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro · MLT · 12 Feb 2023
  • Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning · François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang · 02 Feb 2023
  • On adversarial robustness and the use of Wasserstein ascent-descent dynamics to enforce it · Camilo A. Garcia Trillos, Nicolas García Trillos · 09 Jan 2023
  • Learning Gaussian Mixtures Using the Wasserstein-Fisher-Rao Gradient Flow · Yuling Yan, Kaizheng Wang, Philippe Rigollet · 04 Jan 2023
  • Learning threshold neurons via the "edge of stability" · Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang · MLT · 14 Dec 2022
  • Uniform-in-time propagation of chaos for mean field Langevin dynamics · Fan Chen, Zhenjie Ren, Song-bo Wang · 06 Dec 2022
  • Infinite-width limit of deep linear neural networks · Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli · 29 Nov 2022
  • Unbalanced Optimal Transport, from Theory to Numerics · Thibault Séjourné, Gabriel Peyré, François-Xavier Vialard · OT · 16 Nov 2022
  • Regression as Classification: Influence of Task Formulation on Neural Network Features · Lawrence Stewart, Francis R. Bach, Quentin Berthet, Jean-Philippe Vert · 10 Nov 2022
  • Stochastic Mirror Descent in Average Ensemble Models · Taylan Kargin, Fariborz Salehi, B. Hassibi · 27 Oct 2022
  • Proximal Mean Field Learning in Shallow Neural Networks · Alexis M. H. Teter, Iman Nodozi, A. Halder · FedML · 25 Oct 2022
  • Global Convergence of SGD On Two Layer Neural Nets · Pulkit Gopalani, Anirbit Mukherjee · 20 Oct 2022
  • Annihilation of Spurious Minima in Two-Layer ReLU Networks · Yossi Arjevani, M. Field · 12 Oct 2022
  • Meta-Principled Family of Hyperparameter Scaling Strategies · Sho Yaida · 10 Oct 2022
  • Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent · Michael Kohler, A. Krzyżak · 04 Oct 2022
  • Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty · Thomas George, Guillaume Lajoie, A. Baratin · 19 Sep 2022
  • Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization) · Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · 15 Sep 2022
  • Git Re-Basin: Merging Models modulo Permutation Symmetries · Samuel K. Ainsworth, J. Hayase, S. Srinivasa · MoMe · 11 Sep 2022
  • On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent · Selina Drews, Michael Kohler · 30 Aug 2022
  • Neural Networks can Learn Representations with Gradient Descent · Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi · SSL, MLT · 30 Jun 2022
  • Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation · Loucas Pillaud-Vivien, J. Reygner, Nicolas Flammarion · NoLa · 20 Jun 2022
  • Unbiased Estimation using Underdamped Langevin Dynamics · Hamza Ruzayqat, Neil K. Chada, Ajay Jasra · 14 Jun 2022
  • Neural Collapse: A Review on Modelling Principles and Generalization · Vignesh Kothapalli · 08 Jun 2022
  • High-dimensional limit theorems for SGD: Effective dynamics and critical scaling · Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath · 08 Jun 2022
  • Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs · Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion · ODL · 02 Jun 2022
  • Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width · Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu · 24 May 2022
  • Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks · Blake Bordelon, C. Pehlevan · MLT · 19 May 2022
  • Mean-Field Nonparametric Estimation of Interacting Particle Systems · Rentian Yao, Xiaohui Chen, Yun Yang · 16 May 2022
  • High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation · Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang · MLT · 03 May 2022
  • Convergence of gradient descent for deep neural networks · S. Chatterjee · ODL · 30 Mar 2022
  • On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes · Elvis Dohmatob, A. Bietti · AAML · 22 Mar 2022
  • Fully-Connected Network on Noncompact Symmetric Space and Ridgelet Transform based on Helgason-Fourier Analysis · Sho Sonoda, Isao Ishikawa, Masahiro Ikeda · 03 Mar 2022
  • A blob method for inhomogeneous diffusion with applications to multi-agent control and sampling · Katy Craig, Karthik Elamvazhuthi, M. Haberland, O. Turanova · 25 Feb 2022
  • Provably convergent quasistatic dynamics for mean-field two-player zero-sum games · Chao Ma, Lexing Ying · MLT · 15 Feb 2022
  • Random Feature Amplification: Feature Learning and Generalization in Neural Networks · Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett · MLT · 15 Feb 2022
  • Simultaneous Transport Evolution for Minimax Equilibria on Measures · Carles Domingo-Enrich, Joan Bruna · 14 Feb 2022
  • Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks · R. Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová · MLT · 01 Feb 2022
  • Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks · Bartlomiej Polaczyk, J. Cyranka · ODL · 28 Jan 2022
  • Convex Analysis of the Mean Field Langevin Dynamics · Atsushi Nitanda, Denny Wu, Taiji Suzuki · MLT · 25 Jan 2022
  • Overview frequency principle/spectral bias in deep learning · Z. Xu, Yaoyu Zhang, Tao Luo · FaML · 19 Jan 2022
  • Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime · B. Kerimkulov, J. Leahy, David Siska, Lukasz Szpruch · 18 Jan 2022
  • Asymptotic properties of one-layer artificial neural networks with sparse connectivity · Christian Hirsch, Matthias Neumann, Volker Schmidt · 01 Dec 2021
  • Embedding Principle: a hierarchical structure of loss landscape of deep neural networks · Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Z. Xu · 30 Nov 2021
  • Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks · A. Shevchenko, Vyacheslav Kungurtsev, Marco Mondelli · MLT · 03 Nov 2021
  • The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program · Yifei Wang, Mert Pilanci · MLT, MDE · 13 Oct 2021
  • Parallel Deep Neural Networks Have Zero Duality Gap · Yifei Wang, Tolga Ergen, Mert Pilanci · 13 Oct 2021
  • AIR-Net: Adaptive and Implicit Regularization Neural Network for Matrix Completion · Zhemin Li, Tao Sun, Hongxia Wang, Bao Wang · 12 Oct 2021
  • Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks · Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong · UQCV, MLT · 12 Oct 2021
  • Tighter Sparse Approximation Bounds for ReLU Neural Networks · Carles Domingo-Enrich, Youssef Mroueh · 07 Oct 2021
  • On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime · Zhiyan Ding, Shi Chen, Qin Li, S. Wright · MLT, AI4CE · 06 Oct 2021