A Convergence Theory for Deep Learning via Over-Parameterization
arXiv:1811.03962 · 9 November 2018
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
AI4CE, ODL
Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization" (50 of 367 shown)
SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference. Edouard Yvinec, Arnaud Dapogny, Kévin Bailly, Xavier Fischer. [AAML] 09 Aug 2023
Understanding Deep Neural Networks via Linear Separability of Hidden Layers. Chao Zhang, Xinyuan Chen, Wensheng Li, Lixue Liu, Wei Wu, Dacheng Tao. 26 Jul 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification. Lianke Qin, Zhao Song, Yuanyuan Yang. 13 Jul 2023
Test-Time Training on Video Streams. Renhao Wang, Yu Sun, Yossi Gandelsman, Xinlei Chen, Alexei A. Efros, Xiaolong Wang. [TTA, ViT, 3DGS] 11 Jul 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks. Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou. [MLT] 26 May 2023
An Analytic End-to-End Deep Learning Algorithm based on Collaborative Learning. Sitan Li, C. Cheah. 26 May 2023
SketchOGD: Memory-Efficient Continual Learning. Benjamin Wright, Youngjae Min, Jeremy Bernstein, Navid Azizan. [CLL] 25 May 2023
On the Generalization of Diffusion Model. Mingyang Yi, Jiacheng Sun, Zhenguo Li. 24 May 2023
Physics of Language Models: Part 1, Learning Hierarchical Language Structures. Zeyuan Allen-Zhu, Yuanzhi Li. 23 May 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data. Hossein Taheri, Christos Thrampoulidis. [MLT] 22 May 2023
Tight conditions for when the NTK approximation is valid. Enric Boix-Adserà, Etai Littwin. 22 May 2023
Mode Connectivity in Auction Design. Christoph Hertrich, Yixin Tao, László A. Végh. 18 May 2023
Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization. Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, M. Lewis, Jimmy Ba, Amjad Almahairi. [VLM] 06 May 2023
Neural Exploitation and Exploration of Contextual Bandits. Yikun Ban, Yuchen Yan, A. Banerjee, Jingrui He. 05 May 2023
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains. Yicheng Li, Zixiong Yu, Y. Cotronis, Qian Lin. 04 May 2023
Learning with augmented target information: An alternative theory of Feedback Alignment. Huzi Cheng, Joshua W. Brown. [CVBM] 03 Apr 2023
Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels. Xuchen You, Shouvanik Chakrabarti, Boyang Chen, Xiaodi Wu. 26 Mar 2023
Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks. Ye Li, Songcan Chen, Shengyi Huang. [PINN] 03 Mar 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation. Zhifa Ke, Junyu Zhang, Zaiwen Wen. 25 Feb 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation. Thanh Nguyen-Tang, R. Arora. [OffRL] 24 Feb 2023
PAD: Towards Principled Adversarial Malware Detection Against Evasion Attacks. Deqiang Li, Shicheng Cui, Yun Li, Jia Xu, Fu Xiao, Shouhuai Xu. [AAML] 22 Feb 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. Weihang Xu, S. Du. 20 Feb 2023
Reinforcement Learning with Function Approximation: From Linear to Nonlinear. Jihao Long, Jiequn Han. 20 Feb 2023
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points. Ziye Ma, Igor Molybog, Javad Lavaei, Somayeh Sojoudi. 15 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity. Hongkang Li, Ming Wang, Sijia Liu, Pin-Yu Chen. [ViT, MLT] 12 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning. Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin. 03 Feb 2023
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels. Simone Bombari, Shayan Kiyani, Marco Mondelli. [AAML] 03 Feb 2023
A Survey on Efficient Training of Transformers. Bohan Zhuang, Jing Liu, Zizheng Pan, Haoyu He, Yuetian Weng, Chunhua Shen. 02 Feb 2023
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression. Mo Zhou, Rong Ge. 01 Feb 2023
A Simple Algorithm For Scaling Up Kernel Methods. Tengyu Xu, Bryan Kelly, Semyon Malamud. 26 Jan 2023
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients. Guihong Li, Yuedong Yang, Kartikeya Bhardwaj, R. Marculescu. 26 Jan 2023
Learning Gaussian Mixtures Using the Wasserstein-Fisher-Rao Gradient Flow. Yuling Yan, Kaizheng Wang, Philippe Rigollet. 04 Jan 2023
Sparse neural networks with skip-connections for identification of aluminum electrolysis cell. E. Lundby, Haakon Robinson, Adil Rasheed, I. Halvorsen, J. Gravdahl. 02 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models. Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang. 30 Dec 2022
Effects of Data Geometry in Early Deep Learning. Saket Tiwari, George Konidaris. 29 Dec 2022
Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification. Yuxuan Du, Yibo Yang, Dacheng Tao, Min-hsiu Hsieh. 29 Dec 2022
Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks. Ilja Kuzborskij, Csaba Szepesvári. 28 Dec 2022
COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of Convolutional Neural Networks. Md. Ismail Hossain, Mohammed Rakib, M. M. L. Elahi, Nabeel Mohammed, Shafin Rahman. 24 Dec 2022
Learning threshold neurons via the "edge of stability". Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang. [MLT] 14 Dec 2022
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points. Mayank Baranwal, Param Budhraja, V. Raj, A. Hota. 07 Dec 2022
Reconstructing Training Data from Model Gradient, Provably. Zihan Wang, Jason D. Lee, Qi Lei. [FedML] 07 Dec 2022
Infinite-width limit of deep linear neural networks. Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli. 29 Nov 2022
A Kernel Perspective of Skip Connections in Convolutional Networks. Daniel Barzilai, Amnon Geifman, Meirav Galun, Ronen Basri. 27 Nov 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing. Josh Alman, Jiehao Liang, Zhao Song, Ruizhe Zhang, Danyang Zhuo. 25 Nov 2022
Understanding the double descent curve in Machine Learning. Luis Sa-Couto, J. M. Ramos, Miguel Almeida, Andreas Wichert. 18 Nov 2022
Characterizing the Spectrum of the NTK via a Power Series Expansion. Michael Murray, Hui Jin, Benjamin Bowman, Guido Montúfar. 15 Nov 2022
Cold Start Streaming Learning for Deep Networks. Cameron R. Wolfe, Anastasios Kyrillidis. [CLL] 09 Nov 2022
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks. Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna. [MLT] 28 Oct 2022
Sparsity in Continuous-Depth Neural Networks. H. Aliee, Till Richter, Mikhail Solonin, I. Ibarra, Fabian J. Theis, Niki Kilbertus. 26 Oct 2022
Optimization for Amortized Inverse Problems. Tianci Liu, Tong Yang, Quan Zhang, Qi Lei. 25 Oct 2022