ResearchTrend.AI
A Convergence Theory for Deep Learning via Over-Parameterization

9 November 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
    AI4CE
    ODL

Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization"

50 / 367 papers shown
SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference
Edouard Yvinec
Arnaud Dapogny
Kévin Bailly
Xavier Fischer
AAML
16
2
0
09 Aug 2023
Understanding Deep Neural Networks via Linear Separability of Hidden Layers
Chao Zhang
Xinyuan Chen
Wensheng Li
Lixue Liu
Wei Wu
Dacheng Tao
28
3
0
26 Jul 2023
Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
Lianke Qin
Zhao Song
Yuanyuan Yang
27
9
0
13 Jul 2023
Test-Time Training on Video Streams
Renhao Wang
Yu Sun
Yossi Gandelsman
Xinlei Chen
Alexei A. Efros
Xiaolong Wang
TTA
ViT
3DGS
47
16
0
11 Jul 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
Puyu Wang
Yunwen Lei
Di Wang
Yiming Ying
Ding-Xuan Zhou
MLT
29
4
0
26 May 2023
An Analytic End-to-End Deep Learning Algorithm based on Collaborative Learning
Sitan Li
C. Cheah
8
1
0
26 May 2023
SketchOGD: Memory-Efficient Continual Learning
Benjamin Wright
Youngjae Min
Jeremy Bernstein
Navid Azizan
CLL
28
0
0
25 May 2023
On the Generalization of Diffusion Model
Mingyang Yi
Jiacheng Sun
Zhenguo Li
30
18
0
24 May 2023
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Zeyuan Allen-Zhu
Yuanzhi Li
35
16
0
23 May 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri
Christos Thrampoulidis
MLT
16
3
0
22 May 2023
Tight conditions for when the NTK approximation is valid
Enric Boix-Adserà
Etai Littwin
35
0
0
22 May 2023
Mode Connectivity in Auction Design
Christoph Hertrich
Yixin Tao
László A. Végh
24
1
0
18 May 2023
Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization
Anastasia Razdaibiedina
Yuning Mao
Rui Hou
Madian Khabsa
M. Lewis
Jimmy Ba
Amjad Almahairi
VLM
27
42
0
06 May 2023
Neural Exploitation and Exploration of Contextual Bandits
Yikun Ban
Yuchen Yan
A. Banerjee
Jingrui He
42
8
0
05 May 2023
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains
Yicheng Li
Zixiong Yu
Y. Cotronis
Qian Lin
55
13
0
04 May 2023
Learning with augmented target information: An alternative theory of Feedback Alignment
Huzi Cheng
Joshua W. Brown
CVBM
28
0
0
03 Apr 2023
Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels
Xuchen You
Shouvanik Chakrabarti
Boyang Chen
Xiaodi Wu
39
10
0
26 Mar 2023
Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks
Ye Li
Songcan Chen
Shengyi Huang
PINN
20
3
0
03 Mar 2023
Gauss-Newton Temporal Difference Learning with Nonlinear Function Approximation
Zhifa Ke
Junyu Zhang
Zaiwen Wen
24
0
0
25 Feb 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
Thanh Nguyen-Tang
R. Arora
OffRL
46
5
0
24 Feb 2023
PAD: Towards Principled Adversarial Malware Detection Against Evasion Attacks
Deqiang Li
Shicheng Cui
Yun Li
Jia Xu
Fu Xiao
Shouhuai Xu
AAML
54
18
0
22 Feb 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu
S. Du
37
16
0
20 Feb 2023
Reinforcement Learning with Function Approximation: From Linear to Nonlinear
Jihao Long
Jiequn Han
27
5
0
20 Feb 2023
Over-parametrization via Lifting for Low-rank Matrix Sensing: Conversion of Spurious Solutions to Strict Saddle Points
Ziye Ma
Igor Molybog
Javad Lavaei
Somayeh Sojoudi
31
3
0
15 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
Ming Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
37
57
0
12 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning
Mihaela Rosca
Yan Wu
Chongli Qin
Benoit Dherin
23
7
0
03 Feb 2023
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
Simone Bombari
Shayan Kiyani
Marco Mondelli
AAML
46
10
0
03 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
31
47
0
02 Feb 2023
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Mo Zhou
Rong Ge
37
2
0
01 Feb 2023
A Simple Algorithm For Scaling Up Kernel Methods
Tengyu Xu
Bryan Kelly
Semyon Malamud
21
0
0
26 Jan 2023
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients
Guihong Li
Yuedong Yang
Kartikeya Bhardwaj
R. Marculescu
36
61
0
26 Jan 2023
Learning Gaussian Mixtures Using the Wasserstein-Fisher-Rao Gradient Flow
Yuling Yan
Kaizheng Wang
Philippe Rigollet
44
20
0
04 Jan 2023
Sparse neural networks with skip-connections for identification of aluminum electrolysis cell
E. Lundby
Haakon Robinson
Adil Rasheed
I. Halvorsen
J. Gravdahl
30
2
0
02 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
53
11
0
30 Dec 2022
Effects of Data Geometry in Early Deep Learning
Saket Tiwari
George Konidaris
82
7
0
29 Dec 2022
Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification
Yuxuan Du
Yibo Yang
Dacheng Tao
Min-hsiu Hsieh
48
23
0
29 Dec 2022
Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks
Ilja Kuzborskij
Csaba Szepesvári
21
4
0
28 Dec 2022
COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of Convolutional Neural Networks
Md. Ismail Hossain
Mohammed Rakib
M. M. L. Elahi
Nabeel Mohammed
Shafin Rahman
21
1
0
24 Dec 2022
Learning threshold neurons via the "edge of stability"
Kwangjun Ahn
Sébastien Bubeck
Sinho Chewi
Y. Lee
Felipe Suarez
Yi Zhang
MLT
38
36
0
14 Dec 2022
Generalized Gradient Flows with Provable Fixed-Time Convergence and Fast Evasion of Non-Degenerate Saddle Points
Mayank Baranwal
Param Budhraja
V. Raj
A. Hota
33
2
0
07 Dec 2022
Reconstructing Training Data from Model Gradient, Provably
Zihan Wang
Jason D. Lee
Qi Lei
FedML
32
24
0
07 Dec 2022
Infinite-width limit of deep linear neural networks
Lénaïc Chizat
Maria Colombo
Xavier Fernández-Real
Alessio Figalli
31
14
0
29 Nov 2022
A Kernel Perspective of Skip Connections in Convolutional Networks
Daniel Barzilai
Amnon Geifman
Meirav Galun
Ronen Basri
23
12
0
27 Nov 2022
Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman
Jiehao Liang
Zhao Song
Ruizhe Zhang
Danyang Zhuo
77
31
0
25 Nov 2022
Understanding the double descent curve in Machine Learning
Luis Sa-Couto
J. M. Ramos
Miguel Almeida
Andreas Wichert
35
1
0
18 Nov 2022
Characterizing the Spectrum of the NTK via a Power Series Expansion
Michael Murray
Hui Jin
Benjamin Bowman
Guido Montúfar
38
11
0
15 Nov 2022
Cold Start Streaming Learning for Deep Networks
Cameron R. Wolfe
Anastasios Kyrillidis
CLL
23
2
0
09 Nov 2022
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen
Eric Vanden-Eijnden
Joan Bruna
MLT
27
5
0
28 Oct 2022
Sparsity in Continuous-Depth Neural Networks
H. Aliee
Till Richter
Mikhail Solonin
I. Ibarra
Fabian J. Theis
Niki Kilbertus
29
10
0
26 Oct 2022
Optimization for Amortized Inverse Problems
Tianci Liu
Tong Yang
Quan Zhang
Qi Lei
38
5
0
25 Oct 2022