Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv: 1810.02054 · 4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
MLT, ODL
Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks" (50 / 244 papers shown)
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time
Zhao-quan Song, Licheng Zhang, Ruizhe Zhang · 14 Dec 2021

SCORE: Approximating Curvature Information under Self-Concordant Regularization
Adeyemi Damilare Adeoye, Alberto Bemporad · 14 Dec 2021

Provable Continual Learning via Sketched Jacobian Approximations
Reinhard Heckel · CLL · 09 Dec 2021

On the Convergence of Shallow Neural Network Training with Randomly Masked Neurons
Fangshuo Liao, Anastasios Kyrillidis · 05 Dec 2021

Fast Graph Neural Tangent Kernel via Kronecker Sketching
Shunhua Jiang, Yunze Man, Zhao-quan Song, Zheng Yu, Danyang Zhuo · 04 Dec 2021

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao-quan Song, Atri Rudra, Christopher Ré · 30 Nov 2021

Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
Thanh Nguyen-Tang, Sunil R. Gupta, A. Nguyen, Svetha Venkatesh · OffRL · 27 Nov 2021

Learning with convolution and pooling operations in kernel methods
Theodor Misiakiewicz, Song Mei · MLT · 16 Nov 2021

On the Equivalence between Neural Network and Support Vector Machine
Yilan Chen, Wei Huang, Lam M. Nguyen, Tsui-Wei Weng · AAML · 11 Nov 2021

Subquadratic Overparameterization for Shallow Neural Networks
Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, V. Cevher · 02 Nov 2021
Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher
Mehdi Rezagholizadeh, A. Jafari, Puneeth Salad, Pranav Sharma, Ali Saheb Pasand, A. Ghodsi · 16 Oct 2021

Provable Regret Bounds for Deep Online Learning and Control
Xinyi Chen, Edgar Minasyan, Jason D. Lee, Elad Hazan · 15 Oct 2021

AIR-Net: Adaptive and Implicit Regularization Neural Network for Matrix Completion
Zhemin Li, Tao Sun, Hongxia Wang, Bao Wang · 12 Oct 2021

Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks
Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong · UQCV, MLT · 12 Oct 2021

Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations
Jiayao Zhang, Hua Wang, Weijie J. Su · 11 Oct 2021

Deep Bayesian inference for seismic imaging with tasks
Ali Siahkoohi, G. Rizzuti, Felix J. Herrmann · BDL, UQCV · 10 Oct 2021

Does Preprocessing Help Training Over-parameterized Neural Networks?
Zhao-quan Song, Shuo Yang, Ruizhe Zhang · 09 Oct 2021

New Insights into Graph Convolutional Networks using Neural Tangent Kernels
Mahalakshmi Sabanayagam, P. Esser, D. Ghoshdastidar · 08 Oct 2021

On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime
Zhiyan Ding, Shi Chen, Qin Li, S. Wright · MLT, AI4CE · 06 Oct 2021

Efficient and Private Federated Learning with Partially Trainable Networks
Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen · FedML · 06 Oct 2021
Theory of overparametrization in quantum neural networks
Martín Larocca, Nathan Ju, Diego García-Martín, Patrick J. Coles, M. Cerezo · 23 Sep 2021

NASI: Label- and Data-agnostic Neural Architecture Search at Initialization
Yao Shu, Shaofeng Cai, Zhongxiang Dai, Beng Chin Ooi, K. H. Low · 02 Sep 2021

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu · MLT, AI4CE · 25 Aug 2021

Towards General Function Approximation in Zero-Sum Markov Games
Baihe Huang, Jason D. Lee, Zhaoran Wang, Zhuoran Yang · 30 Jul 2021

Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
Arnulf Jentzen, Adrian Riekert · 09 Jul 2021

A Theoretical Analysis of Fine-tuning with Linear Teachers
Gal Shachaf, Alon Brutzkus, Amir Globerson · 04 Jul 2021

Random Neural Networks in the Infinite Width Limit as Gaussian Processes
Boris Hanin · BDL · 04 Jul 2021

AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen, Houwen Peng, Jianlong Fu, Haibin Ling · ViT · 01 Jul 2021

Locality defeats the curse of dimensionality in convolutional teacher-student scenarios
Alessandro Favero, Francesco Cagnetta, M. Wyart · 16 Jun 2021

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms
A. Camuto, George Deligiannidis, Murat A. Erdogdu, Mert Gurbuzbalaban, Umut Şimşekli, Lingjiong Zhu · 09 Jun 2021
TENGraD: Time-Efficient Natural Gradient Descent with Exact Fisher-Block Inversion
Saeed Soori, Bugra Can, Baourun Mu, Mert Gurbuzbalaban, M. Dehnavi · 07 Jun 2021

Practical Convex Formulation of Robust One-hidden-layer Neural Network Training
Yatong Bai, Tanmay Gautam, Yujie Gai, Somayeh Sojoudi · AAML · 25 May 2021

Global Convergence of Three-layer Neural Networks in the Mean Field Regime
H. Pham, Phan-Minh Nguyen · MLT, AI4CE · 11 May 2021

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis
Baihe Huang, Xiaoxiao Li, Zhao-quan Song, Xin Yang · FedML · 11 May 2021

RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton · 01 May 2021

Generalization Guarantees for Neural Architecture Search with Train-Validation Split
Samet Oymak, Mingchen Li, Mahdi Soltanolkotabi · AI4CE, OOD · 29 Apr 2021

PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols
Songlin Yang, Yanpeng Zhao, Kewei Tu · 28 Apr 2021

Understanding Overparameterization in Generative Adversarial Networks
Yogesh Balaji, M. Sajedi, N. Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, S. Feizi · AI4CE · 12 Apr 2021

A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions
Arnulf Jentzen, Adrian Riekert · MLT · 01 Apr 2021

The Discovery of Dynamics via Linear Multistep Methods and Deep Learning: Error Estimation
Q. Du, Yiqi Gu, Haizhao Yang, Chao Zhou · 21 Mar 2021
Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy
Lucas Liebenwein, Cenk Baykal, Brandon Carter, David K. Gifford, Daniela Rus · AAML · 04 Mar 2021

Experiments with Rich Regime Training for Deep Learning
Xinyan Li, A. Banerjee · 26 Feb 2021

Learning with invariances in random features and kernel models
Song Mei, Theodor Misiakiewicz, Andrea Montanari · OOD · 25 Feb 2021

On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)
Zhiyuan Li, Sadhika Malladi, Sanjeev Arora · 24 Feb 2021

Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases
Arnulf Jentzen, T. Kröger · ODL · 23 Feb 2021

GIST: Distributed Training for Large-Scale Graph Convolutional Networks
Cameron R. Wolfe, Jingkang Yang, Arindam Chowdhury, Chen Dun, Artun Bayer, Santiago Segarra, Anastasios Kyrillidis · BDL, GNN, LRM · 20 Feb 2021

A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions
Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek · 19 Feb 2021

On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Shahar Azulay, E. Moroshko, Mor Shpigel Nacson, Blake E. Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry · AI4CE · 19 Feb 2021

Reproducing Activation Function for Deep Learning
Senwei Liang, Liyao Lyu, Chunmei Wang, Haizhao Yang · 13 Jan 2021

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · FedML · 17 Dec 2020