A Convergence Theory for Deep Learning via Over-Parameterization
arXiv:1811.03962 · 9 November 2018
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
AI4CE, ODL
Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization" (50 of 367 shown)
SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference. Edouard Yvinec, Arnaud Dapogny, Kévin Bailly, Xavier Fischer. [AAML] 09 Aug 2023
Understanding Deep Neural Networks via Linear Separability of Hidden Layers. Chao Zhang, Xinyuan Chen, Wensheng Li, Lixue Liu, Wei Wu, Dacheng Tao. 26 Jul 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification. Lianke Qin, Zhao Song, Yuanyuan Yang. 13 Jul 2023
Test-Time Training on Video Streams. Renhao Wang, Yu Sun, Yossi Gandelsman, Xinlei Chen, Alexei A. Efros, Xiaolong Wang. [TTA, ViT, 3DGS] 11 Jul 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks. Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou. [MLT] 26 May 2023
An Analytic End-to-End Deep Learning Algorithm based on Collaborative Learning. Sitan Li, C. Cheah. 26 May 2023
SketchOGD: Memory-Efficient Continual Learning. Benjamin Wright, Youngjae Min, Jeremy Bernstein, Navid Azizan. [CLL] 25 May 2023
On the Generalization of Diffusion Model. Mingyang Yi, Jiacheng Sun, Zhenguo Li. 24 May 2023
Physics of Language Models: Part 1, Learning Hierarchical Language Structures. Zeyuan Allen-Zhu, Yuanzhi Li. 23 May 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data. Hossein Taheri, Christos Thrampoulidis. [MLT] 22 May 2023
Tight conditions for when the NTK approximation is valid. Enric Boix-Adserà, Etai Littwin. 22 May 2023
Mode Connectivity in Auction Design. Christoph Hertrich, Yixin Tao, László A. Végh. 18 May 2023
Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization. Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, M. Lewis, Jimmy Ba, Amjad Almahairi. [VLM] 06 May 2023
Neural Exploitation and Exploration of Contextual Bandits. Yikun Ban, Yuchen Yan, A. Banerjee, Jingrui He. 05 May 2023
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains. Yicheng Li, Zixiong Yu, Y. Cotronis, Qian Lin. 04 May 2023
Learning with augmented target information: An alternative theory of Feedback Alignment. Huzi Cheng, Joshua W. Brown. [CVBM] 03 Apr 2023
Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels. Xuchen You, Shouvanik Chakrabarti, Boyang Chen, Xiaodi Wu. 26 Mar 2023
Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks. Ye Li, Songcan Chen, Shengyi Huang. [PINN] 03 Mar 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation. Zhifa Ke, Junyu Zhang, Zaiwen Wen. 25 Feb 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation. Thanh Nguyen-Tang, R. Arora. [OffRL] 24 Feb 2023
PAD: Towards Principled Adversarial Malware Detection Against Evasion Attacks. Deqiang Li, Shicheng Cui, Yun Li, Jia Xu, Fu Xiao, Shouhuai Xu. [AAML] 22 Feb 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. Weihang Xu, S. Du. 20 Feb 2023
Reinforcement Learning with Function Approximation: From Linear to Nonlinear. Jihao Long, Jiequn Han. 20 Feb 2023
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points. Ziye Ma, Igor Molybog, Javad Lavaei, Somayeh Sojoudi. 15 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity. Hongkang Li, Ming Wang, Sijia Liu, Pin-Yu Chen. [ViT, MLT] 12 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning. Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin. 03 Feb 2023
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels. Simone Bombari, Shayan Kiyani, Marco Mondelli. [AAML] 03 Feb 2023
A Survey on Efficient Training of Transformers. Bohan Zhuang, Jing Liu, Zizheng Pan, Haoyu He, Yuetian Weng, Chunhua Shen. 02 Feb 2023
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression. Mo Zhou, Rong Ge. 01 Feb 2023
A Simple Algorithm For Scaling Up Kernel Methods. Tengyu Xu, Bryan Kelly, Semyon Malamud. 26 Jan 2023
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients. Guihong Li, Yuedong Yang, Kartikeya Bhardwaj, R. Marculescu. 26 Jan 2023
Learning Gaussian Mixtures Using the Wasserstein-Fisher-Rao Gradient Flow. Yuling Yan, Kaizheng Wang, Philippe Rigollet. 04 Jan 2023
Sparse neural networks with skip-connections for identification of aluminum electrolysis cell. E. Lundby, Haakon Robinson, Adil Rasheed, I. Halvorsen, J. Gravdahl. 02 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models. Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang. 30 Dec 2022
Effects of Data Geometry in Early Deep Learning. Saket Tiwari, George Konidaris. 29 Dec 2022
Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification. Yuxuan Du, Yibo Yang, Dacheng Tao, Min-hsiu Hsieh. 29 Dec 2022
Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks. Ilja Kuzborskij, Csaba Szepesvári. 28 Dec 2022
COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of Convolutional Neural Networks. Md. Ismail Hossain, Mohammed Rakib, M. M. L. Elahi, Nabeel Mohammed, Shafin Rahman. 24 Dec 2022
Learning threshold neurons via the "edge of stability". Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang. [MLT] 14 Dec 2022
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points. Mayank Baranwal, Param Budhraja, V. Raj, A. Hota. 07 Dec 2022
Reconstructing Training Data from Model Gradient, Provably. Zihan Wang, Jason D. Lee, Qi Lei. [FedML] 07 Dec 2022
Infinite-width limit of deep linear neural networks. Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli. 29 Nov 2022
A Kernel Perspective of Skip Connections in Convolutional Networks. Daniel Barzilai, Amnon Geifman, Meirav Galun, Ronen Basri. 27 Nov 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing. Josh Alman, Jiehao Liang, Zhao Song, Ruizhe Zhang, Danyang Zhuo. 25 Nov 2022
Understanding the double descent curve in Machine Learning. Luis Sa-Couto, J. M. Ramos, Miguel Almeida, Andreas Wichert. 18 Nov 2022
Characterizing the Spectrum of the NTK via a Power Series Expansion. Michael Murray, Hui Jin, Benjamin Bowman, Guido Montúfar. 15 Nov 2022
Cold Start Streaming Learning for Deep Networks. Cameron R. Wolfe, Anastasios Kyrillidis. [CLL] 09 Nov 2022
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks. Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna. [MLT] 28 Oct 2022
Sparsity in Continuous-Depth Neural Networks. H. Aliee, Till Richter, Mikhail Solonin, I. Ibarra, Fabian J. Theis, Niki Kilbertus. 26 Oct 2022
Optimization for Amortized Inverse Problems. Tianci Liu, Tong Yang, Quan Zhang, Qi Lei. 25 Oct 2022