An Improved Analysis of Training Over-parameterized Deep Neural Networks

11 June 2019

Quanquan Gu

Papers citing "An Improved Analysis of Training Over-parameterized Deep Neural Networks"

High-entropy Advantage in Neural Networks' Generalizability Entao Yang X. Zhang Yue Shang Ge Zhang AI4CE 58 0 0 17 Mar 2025
MLPs at the EOC: Dynamics of Feature Learning Dávid Terjék MLT 41 0 0 18 Feb 2025
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits H. Bui Enrique Mallada Anqi Liu 97 0 0 08 Nov 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes Nikita Kiselev Andrey Grabovoy 41 1 0 18 Sep 2024
How to Protect Copyright Data in Optimization of Large Language Models? T. Chu Zhao-quan Song Chiwun Yang 34 29 0 23 Aug 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification Lianke Qin Zhao-quan Song Yuanyuan Yang 22 9 0 13 Jul 2023
Considering Layerwise Importance in the Lottery Ticket Hypothesis Benjamin Vandersmissen José Oramas 15 1 0 22 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity Hongkang Li M. Wang Sijia Liu Pin-Yu Chen ViT MLT 35 56 0 12 Feb 2023
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning François Caron Fadhel Ayed Paul Jung Hoileong Lee Juho Lee Hongseok Yang 62 2 0 02 Feb 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models Yufeng Zhang Boyi Liu Qi Cai Lingxiao Wang Zhaoran Wang 45 11 0 30 Dec 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing Josh Alman Jiehao Liang Zhao-quan Song Ruizhe Zhang Danyang Zhuo 71 31 0 25 Nov 2022
Characterizing the Spectrum of the NTK via a Power Series Expansion Michael Murray Hui Jin Benjamin Bowman Guido Montúfar 30 11 0 15 Nov 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work Jiawei Zhang Yushun Zhang Mingyi Hong Ruoyu Sun Z. Luo 26 10 0 21 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets Pulkit Gopalani Anirbit Mukherjee 20 5 0 20 Oct 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$ R. Gentile G. Welper ODL 46 6 0 17 Sep 2022
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity Jianyi Yang Shaolei Ren 24 3 0 02 Jul 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis Alexander Munteanu Simon Omlor Zhao-quan Song David P. Woodruff 27 15 0 26 Jun 2022
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms Lam M. Nguyen Trang H. Tran 32 2 0 13 Jun 2022
Global Convergence of Over-parameterized Deep Equilibrium Models Zenan Ling Xingyu Xie Qiuhao Wang Zongpeng Zhang Zhouchen Lin 27 12 0 27 May 2022
Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture Libin Zhu Chaoyue Liu M. Belkin GNN AI4CE 15 4 0 24 May 2022
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks Bartlomiej Polaczyk J. Cyranka ODL 33 3 0 28 Jan 2022
A Kernel-Expanded Stochastic Neural Network Y. Sun F. Liang 20 5 0 14 Jan 2022
Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks Benjamin Bowman Guido Montúfar 18 11 0 12 Jan 2022
Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time Zhao-quan Song Licheng Zhang Ruizhe Zhang 23 63 0 14 Dec 2021
Subquadratic Overparameterization for Shallow Neural Networks Chaehwan Song Ali Ramezani-Kebrya Thomas Pethick Armin Eftekhari V. Cevher 24 32 0 02 Nov 2021
Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks Shuai Zhang Meng Wang Sijia Liu Pin-Yu Chen Jinjun Xiong UQCV MLT 21 13 0 12 Oct 2021
A global convergence theory for deep ReLU implicit networks via over-parameterization Tianxiang Gao Hailiang Liu Jia Liu Hridesh Rajan Hongyang Gao MLT 23 16 0 11 Oct 2021
Does Preprocessing Help Training Over-parameterized Neural Networks? Zhao-quan Song Shuo Yang Ruizhe Zhang 30 49 0 09 Oct 2021
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization Difan Zou Yuan Cao Yuanzhi Li Quanquan Gu MLT AI4CE 44 37 0 25 Aug 2021
What can linearized neural networks actually say about generalization? Guillermo Ortiz-Jiménez Seyed-Mohsen Moosavi-Dezfooli P. Frossard 21 43 0 12 Jun 2021
Understanding Overparameterization in Generative Adversarial Networks Yogesh Balaji M. Sajedi N. Kalibhat Mucong Ding Dominik Stöger Mahdi Soltanolkotabi S. Feizi AI4CE 14 21 0 12 Apr 2021
Understanding and Increasing Efficiency of Frank-Wolfe Adversarial Training Theodoros Tsiligkaridis Jay Roberts AAML 11 11 0 22 Dec 2020
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders Yibo Jiang C. Pehlevan 11 13 0 30 Jun 2020
Logarithmic Pruning is All You Need Laurent Orseau Marcus Hutter Omar Rivasplata 23 88 0 22 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory Yufeng Zhang Qi Cai Zhuoran Yang Yongxin Chen Zhaoran Wang OOD MLT 63 11 0 08 Jun 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning Zeyuan Allen-Zhu Yuanzhi Li MLT AAML 27 146 0 20 May 2020
Learning Parities with Neural Networks Amit Daniely Eran Malach 18 76 0 18 Feb 2020
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning Zixin Wen SSL 21 2 0 17 Feb 2020
Memory capacity of neural networks with threshold and ReLU activations Roman Vershynin 26 21 0 20 Jan 2020
Distributionally Robust Deep Learning using Hardness Weighted Sampling Lucas Fidon Michael Aertsen Thomas Deprest Doaa Emam Frédéric Guffens ... Andrew Melbourne Sébastien Ourselin Jan Deprest Georg Langs Tom Kamiel Magda Vercauteren OOD 14 10 0 08 Jan 2020
Towards Understanding the Spectral Bias of Deep Learning Yuan Cao Zhiying Fang Yue Wu Ding-Xuan Zhou Quanquan Gu 23 214 0 03 Dec 2019
Neural Contextual Bandits with UCB-based Exploration Dongruo Zhou Lihong Li Quanquan Gu 22 15 0 11 Nov 2019
Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems Atsushi Nitanda Geoffrey Chinot Taiji Suzuki MLT 10 33 0 23 May 2019