On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport

Lénaïc Chizat, Francis R. Bach
OT
24 May 2018

Papers citing "On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport"

Showing 50 of 161 citing papers.

Optimizing full 3D SPARKLING trajectories for high-resolution T2*-weighted Magnetic Resonance Imaging
Chaithya G. R., P. Weiss, Guillaume Daval-Frérot, Aurélien Massire, A. Vignaud, P. Ciuciu
06 Aug 2021

Interpolation can hurt robust generalization even when there is no noise
Konstantin Donhauser, Alexandru Țifrea, Michael Aerni, Reinhard Heckel, Fanny Yang
05 Aug 2021

The loss landscape of deep linear neural networks: a second-order analysis
E. M. Achour, François Malgouyres, Sébastien Gerchinovitz
ODL
28 Jul 2021

Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
Yossi Arjevani, M. Field
21 Jul 2021

The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion
D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins
19 Jul 2021

Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks
Carles Domingo-Enrich, A. Bietti, Marylou Gabrié, Joan Bruna, Eric Vanden-Eijnden
FedML
11 Jul 2021

Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
Arnulf Jentzen, Adrian Riekert
09 Jul 2021

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction
Dominik Stöger, Mahdi Soltanolkotabi
ODL
28 Jun 2021

Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent
Spencer Frei, Quanquan Gu
25 Jun 2021

Extracting Global Dynamics of Loss Landscape in Deep Learning Models
Mohammed Eslami, Hamed Eramian, Marcio Gameiro, W. Kalies, Konstantin Mischaikow
14 Jun 2021

The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Geoff Pleiss, John P. Cunningham
11 Jun 2021

The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan Bill Li, Mihai Nica, Daniel M. Roy
07 Jun 2021

Global Convergence of Three-layer Neural Networks in the Mean Field Regime
H. Pham, Phan-Minh Nguyen
MLT, AI4CE
11 May 2021

Relative stability toward diffeomorphisms indicates performance in deep nets
Leonardo Petrini, Alessandro Favero, Mario Geiger, M. Wyart
OOD
06 May 2021

Two-layer neural networks with values in a Banach space
Yury Korolev
05 May 2021

A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions
Arnulf Jentzen, Adrian Riekert
MLT
01 Apr 2021

Do Input Gradients Highlight Discriminative Features?
Harshay Shah, Prateek Jain, Praneeth Netrapalli
AAML, FAtt
25 Feb 2021

Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases
Arnulf Jentzen, T. Kröger
ODL
23 Feb 2021

A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek
19 Feb 2021

A Priori Generalization Analysis of the Deep Ritz Method for Solving High Dimensional Elliptic Equations
Jianfeng Lu, Yulong Lu, Min Wang
05 Jan 2021

On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers
E. Weinan, Stephan Wojtowytsch
10 Dec 2020

Align, then memorise: the dynamics of learning with feedback alignment
Maria Refinetti, Stéphane d'Ascoli, Ruben Ohana, Sebastian Goldt
24 Nov 2020

Neural collapse with unconstrained features
D. Mixon, Hans Parshall, Jianzong Pi
23 Nov 2020

On the Convergence of Gradient Descent in GANs: MMD GAN As a Gradient Flow
Youssef Mroueh, Truyen V. Nguyen
04 Nov 2020

Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Andrea Agazzi, Jianfeng Lu
22 Oct 2020

Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti, Francis R. Bach
30 Sep 2020

Machine Learning and Computational Mathematics
Weinan E
PINN, AI4CE
23 Sep 2020

The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation
Thibault Séjourné, François-Xavier Vialard, Gabriel Peyré
OT
09 Sep 2020

Quantitative Propagation of Chaos for SGD in Wide Neural Networks
Valentin De Bortoli, Alain Durmus, Xavier Fontaine, Umut Simsekli
13 Jul 2020

The Gaussian equivalence of generative models for learning with shallow neural networks
Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, M. Mézard, Lenka Zdeborová
BDL
25 Jun 2020

Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
12 Jun 2020

Representation formulas and pointwise properties for Barron functions
E. Weinan, Stephan Wojtowytsch
10 Jun 2020

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang
OOD, MLT
08 Jun 2020

Can Shallow Neural Networks Beat the Curse of Dimensionality? A mean field training perspective
Stephan Wojtowytsch, E. Weinan
MLT
21 May 2020

Symmetry & critical points for a model shallow neural network
Yossi Arjevani, M. Field
23 Mar 2020

A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth
Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying
MLT
11 Mar 2020

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Lénaïc Chizat, Francis R. Bach
MLT
11 Feb 2020

On the infinite width limit of neural networks with a standard parameterization
Jascha Narain Sohl-Dickstein, Roman Novak, S. Schoenholz, Jaehoon Lee
21 Jan 2020

Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity
Shiyu Liang, Ruoyu Sun, R. Srikant
31 Dec 2019

Machine Learning from a Continuous Viewpoint
E. Weinan, Chao Ma, Lei Wu
30 Dec 2019

Sinkhorn Divergences for Unbalanced Optimal Transport
Thibault Séjourné, Jean Feydy, François-Xavier Vialard, A. Trouvé, Gabriel Peyré
OT
28 Oct 2019

Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
Yu Bai, J. Lee
03 Oct 2019

Finite Depth and Width Corrections to the Neural Tangent Kernel
Boris Hanin, Mihai Nica
MDE
13 Sep 2019

The generalization error of random features regression: Precise asymptotics and double descent curve
Song Mei, Andrea Montanari
14 Aug 2019

Maximum Mean Discrepancy Gradient Flow
Michael Arbel, Anna Korba, Adil Salim, A. Gretton
11 Jun 2019

Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
Atsushi Nitanda, Geoffrey Chinot, Taiji Suzuki
MLT
23 May 2019

Linearized two-layers neural networks in high dimension
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
MLT
27 Apr 2019

A Selective Overview of Deep Learning
Jianqing Fan, Cong Ma, Yiqiao Zhong
BDL, VLM
10 Apr 2019

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
E. Weinan, Chao Ma, Qingcan Wang, Lei Wu
MLT
10 Apr 2019

Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak
NoLa
27 Mar 2019