A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
arXiv:1811.03962 · 9 November 2018 · AI4CE, ODL

Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization" (showing 50 of 370)
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna · MLT · 27 / 5 / 0 · 28 Oct 2022

Sparsity in Continuous-Depth Neural Networks
H. Aliee, Till Richter, Mikhail Solonin, I. Ibarra, Fabian J. Theis, Niki Kilbertus · 29 / 10 / 0 · 26 Oct 2022

Optimization for Amortized Inverse Problems
Tianci Liu, Tong Yang, Quan Zhang, Qi Lei · 38 / 5 / 0 · 25 Oct 2022

Evolution of Neural Tangent Kernels under Benign and Adversarial Training
Noel Loo, Ramin Hasani, Alexander Amini, Daniela Rus · AAML · 36 / 13 / 0 · 21 Oct 2022

When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo · 29 / 10 / 0 · 21 Oct 2022

Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani, Anirbit Mukherjee · 26 / 5 / 0 · 20 Oct 2022

Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks
Louis Schatzki, Martín Larocca, Quynh T. Nguyen, F. Sauvage, M. Cerezo · 44 / 85 / 0 · 18 Oct 2022

Review Learning: Alleviating Catastrophic Forgetting with Generative Replay without Generator
Jaesung Yoo, Sung-Hyuk Choi, Yewon Yang, Suhyeon Kim, J. Choi, ..., H. J. Joo, Dae-Jung Kim, R. Park, Hyeong-Jin Yoon, Kwangsoo Kim · KELM, OffRL · 40 / 0 / 0 · 17 Oct 2022

From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan · 29 / 4 / 0 · 13 Oct 2022

Towards Theoretically Inspired Neural Initialization Optimization
Yibo Yang, Hong Wang, Haobo Yuan, Zhouchen Lin · 29 / 9 / 0 · 12 Oct 2022

On skip connections and normalisation layers in deep optimisation
L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey · ODL · 32 / 1 / 0 · 10 Oct 2022

Stability Analysis and Generalization Bounds of Adversarial Training
Jiancong Xiao, Yanbo Fan, Ruoyu Sun, Jue Wang, Zhimin Luo · AAML · 32 / 30 / 0 · 03 Oct 2022

Adaptive Smoothness-weighted Adversarial Training for Multiple Perturbations with Its Stability Analysis
Jiancong Xiao, Zeyu Qin, Yanbo Fan, Baoyuan Wu, Jue Wang, Zhimin Luo · AAML · 34 / 7 / 0 · 02 Oct 2022

Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma, Li-Zhen Guo, S. Fattahi · 41 / 4 / 0 · 01 Oct 2022

Rethinking skip connection model as a learnable Markov chain
Dengsheng Chen, Jie Hu, Wenwen Qiang, Xiaoming Wei, Enhua Wu · BDL · 27 / 1 / 0 · 30 Sep 2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu · MLT · 324 / 48 / 0 · 29 Sep 2022

Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee, Byeongsu Sim, Jong Chul Ye · MLT · 96 / 6 / 0 · 27 Sep 2022

Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George, Guillaume Lajoie, A. Baratin · 31 / 5 / 0 · 19 Sep 2022

Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
R. Gentile, G. Welper · ODL · 56 / 6 / 0 · 17 Sep 2022

Flashlight: Scalable Link Prediction with Effective Decoders
Yiwei Wang, Bryan Hooi, Yozen Liu, Tong Zhao, Zhichun Guo, Neil Shah · BDL · 16 / 6 / 0 · 17 Sep 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · 39 / 19 / 0 · 15 Sep 2022

Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · AI4CE · 33 / 15 / 0 · 15 Sep 2022

On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent
Selina Drews, Michael Kohler · 30 / 14 / 0 · 30 Aug 2022

Universal Solutions of Feedforward ReLU Networks for Interpolations
Changcun Huang · 18 / 2 / 0 · 16 Aug 2022

Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li · MLT, MoE · 42 / 53 / 0 · 04 Aug 2022

Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 36 / 5 / 0 · 03 Aug 2022

BiFeat: Supercharge GNN Training via Graph Feature Quantization
Yuxin Ma, Ping Gong, Jun Yi, Z. Yao, Cheng-rong Li, Yuxiong He, Feng Yan · GNN · 21 / 6 / 0 · 29 Jul 2022

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang · 39 / 124 / 0 · 18 Jul 2022

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora · 45 / 27 / 0 · 08 Jul 2022

Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Jianyi Yang, Shaolei Ren · 32 / 3 / 0 · 02 Jul 2022

q-Learning in Continuous Time
Yanwei Jia, X. Zhou · OffRL · 51 / 69 / 0 · 02 Jul 2022

Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi · SSL, MLT · 25 / 114 / 0 · 30 Jun 2022

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff · 33 / 15 / 0 · 26 Jun 2022

Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
Jiri Hron, Roman Novak, Jeffrey Pennington, Jascha Narain Sohl-Dickstein · UQCV, BDL · 48 / 6 / 0 · 15 Jun 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora · FAtt · 45 / 71 / 0 · 14 Jun 2022

Scaling ResNets in the Large-depth Regime
Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert · 26 / 16 / 0 · 14 Jun 2022

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen, Trang H. Tran · 32 / 2 / 0 · 13 Jun 2022

Wavelet Regularization Benefits Adversarial Training
Jun Yan, Huilin Yin, Xiaoyang Deng, Zi-qin Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll · AAML · 19 / 2 / 0 · 08 Jun 2022

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yunzhi Bai, Jason D. Lee · 29 / 10 / 0 · 08 Jun 2022

Non-convex online learning via algorithmic equivalence
Udaya Ghai, Zhou Lu, Elad Hazan · 14 / 8 / 0 · 30 May 2022

Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling, Xingyu Xie, Qiuhao Wang, Zongpeng Zhang, Zhouchen Lin · 32 / 12 / 0 · 27 May 2022

Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu · 25 / 21 / 0 · 24 May 2022

Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu, Chaoyue Liu, M. Belkin · GNN, AI4CE · 23 / 4 / 0 · 24 May 2022

Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable
Promit Ghosal, Srinath Mahankali, Yihang Sun · MLT · 29 / 4 / 0 · 24 May 2022

Gaussian Pre-Activations in Neural Networks: Myth or Reality?
Pierre Wolinski, Julyan Arbel · AI4CE · 76 / 8 / 0 · 24 May 2022

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Blake Bordelon, Cengiz Pehlevan · MLT · 40 / 77 / 0 · 19 May 2022

Robust Deep Neural Network Estimation for Multi-dimensional Functional Data
Shuoyang Wang, Guanqun Cao · 3DPC, OOD · 27 / 4 / 0 · 19 May 2022

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran, Gal Vardi, Jason D. Lee · MLT · 59 / 23 / 0 · 18 May 2022

The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen, Yuanzhi Li · SSL · 32 / 34 / 0 · 12 May 2022

Bridging Model-based Safety and Model-free Reinforcement Learning through System Identification of Low Dimensional Linear Models
Zhongyu Li, Jun Zeng, A. Thirugnanam, Koushil Sreenath · 29 / 16 / 0 · 11 May 2022