ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1810.02054
  4. Cited By
Gradient Descent Provably Optimizes Over-parameterized Neural Networks

Gradient Descent Provably Optimizes Over-parameterized Neural Networks

4 October 2018
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
    MLT
    ODL
ArXivPDFHTML

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 244 papers shown
Title
On the emergence of simplex symmetry in the final and penultimate layers
  of neural network classifiers
On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers
E. Weinan
Stephan Wojtowytsch
23
42
0
10 Dec 2020
Neural collapse with unconstrained features
Neural collapse with unconstrained features
D. Mixon
Hans Parshall
Jianzong Pi
11
114
0
23 Nov 2020
Gradient Starvation: A Learning Proclivity in Neural Networks
Gradient Starvation: A Learning Proclivity in Neural Networks
Mohammad Pezeshki
Sekouba Kaba
Yoshua Bengio
Aaron Courville
Doina Precup
Guillaume Lajoie
MLT
45
257
0
18 Nov 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence
  Guarantee
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
34
121
0
11 Nov 2020
On Function Approximation in Reinforcement Learning: Optimism in the
  Face of Large State Spaces
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
18
18
0
09 Nov 2020
Federated Knowledge Distillation
Federated Knowledge Distillation
Hyowoon Seo
Jihong Park
Seungeun Oh
M. Bennis
Seong-Lyun Kim
FedML
12
90
0
04 Nov 2020
Are wider nets better given the same number of parameters?
Are wider nets better given the same number of parameters?
A. Golubeva
Behnam Neyshabur
Guy Gur-Ari
13
44
0
27 Oct 2020
An Investigation of how Label Smoothing Affects Generalization
An Investigation of how Label Smoothing Affects Generalization
Blair Chen
Liu Ziyin
Zihao W. Wang
Paul Pu Liang
UQCV
21
17
0
23 Oct 2020
Global optimality of softmax policy gradient with single hidden layer
  neural networks in the mean-field regime
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Andrea Agazzi
Jianfeng Lu
13
15
0
22 Oct 2020
Deep Learning is Singular, and That's Good
Deep Learning is Singular, and That's Good
Daniel Murfet
Susan Wei
Mingming Gong
Hui Li
Jesse Gell-Redman
T. Quella
UQCV
16
26
0
22 Oct 2020
Deep Reinforcement Learning for Adaptive Network Slicing in 5G for
  Intelligent Vehicular Systems and Smart Cities
Deep Reinforcement Learning for Adaptive Network Slicing in 5G for Intelligent Vehicular Systems and Smart Cities
A. Nassar
Y. Yilmaz
AI4CE
11
55
0
19 Oct 2020
A Theoretical Analysis of Catastrophic Forgetting through the NTK
  Overlap Matrix
A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix
T. Doan
Mehdi Abbana Bennani
Bogdan Mazoure
Guillaume Rabusseau
Pierre Alquier
CLL
15
80
0
07 Oct 2020
Computational Separation Between Convolutional and Fully-Connected
  Networks
Computational Separation Between Convolutional and Fully-Connected Networks
Eran Malach
Shai Shalev-Shwartz
14
26
0
03 Oct 2020
On the linearity of large non-linear models: when and why the tangent
  kernel is constant
On the linearity of large non-linear models: when and why the tangent kernel is constant
Chaoyue Liu
Libin Zhu
M. Belkin
14
138
0
02 Oct 2020
Deep Equals Shallow for ReLU Networks in Kernel Regimes
Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti
Francis R. Bach
12
86
0
30 Sep 2020
Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
Jingtong Su
Yihang Chen
Tianle Cai
Tianhao Wu
Ruiqi Gao
Liwei Wang
J. Lee
6
85
0
22 Sep 2020
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Lin Chen
Sheng Xu
19
93
0
22 Sep 2020
Generalized Leverage Score Sampling for Neural Networks
Generalized Leverage Score Sampling for Neural Networks
J. Lee
Ruoqi Shen
Zhao-quan Song
Mengdi Wang
Zheng Yu
13
42
0
21 Sep 2020
GraphNorm: A Principled Approach to Accelerating Graph Neural Network
  Training
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai
Shengjie Luo
Keyulu Xu
Di He
Tie-Yan Liu
Liwei Wang
GNN
16
158
0
07 Sep 2020
Predicting Training Time Without Training
Predicting Training Time Without Training
L. Zancato
Alessandro Achille
Avinash Ravichandran
Rahul Bhotika
Stefano Soatto
18
24
0
28 Aug 2020
Deep Networks and the Multiple Manifold Problem
Deep Networks and the Multiple Manifold Problem
Sam Buchanan
D. Gilboa
John N. Wright
166
39
0
25 Aug 2020
The Interpolation Phase Transition in Neural Networks: Memorization and
  Generalization under Lazy Training
The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training
Andrea Montanari
Yiqiao Zhong
36
95
0
25 Jul 2020
Geometric compression of invariant manifolds in neural nets
Geometric compression of invariant manifolds in neural nets
J. Paccolat
Leonardo Petrini
Mario Geiger
Kevin Tyloo
M. Wyart
MLT
47
34
0
22 Jul 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs
  Training Accuracy
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko
Suriya Gunasekar
Blake E. Woodworth
J. Lee
Nathan Srebro
Daniel Soudry
27
85
0
13 Jul 2020
Two-Layer Neural Networks for Partial Differential Equations:
  Optimization and Generalization Theory
Two-Layer Neural Networks for Partial Differential Equations: Optimization and Generalization Theory
Tao Luo
Haizhao Yang
13
73
0
28 Jun 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture
Tensor Programs II: Neural Tangent Kernel for Any Architecture
Greg Yang
35
134
0
25 Jun 2020
Generalisation Guarantees for Continual Learning with Orthogonal
  Gradient Descent
Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent
Mehdi Abbana Bennani
Thang Doan
Masashi Sugiyama
CLL
42
61
0
21 Jun 2020
Fourier Features Let Networks Learn High Frequency Functions in Low
  Dimensional Domains
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
Matthew Tancik
Pratul P. Srinivasan
B. Mildenhall
Sara Fridovich-Keil
N. Raghavan
Utkarsh Singhal
R. Ramamoorthi
Jonathan T. Barron
Ren Ng
60
2,335
0
18 Jun 2020
Kernel Alignment Risk Estimator: Risk Prediction from Training Data
Kernel Alignment Risk Estimator: Risk Prediction from Training Data
Arthur Jacot
Berfin cSimcsek
Francesco Spadaro
Clément Hongler
Franck Gabriel
17
66
0
17 Jun 2020
Directional Pruning of Deep Neural Networks
Directional Pruning of Deep Neural Networks
Shih-Kang Chao
Zhanyu Wang
Yue Xing
Guang Cheng
ODL
8
33
0
16 Jun 2020
Optimization and Generalization Analysis of Transduction through
  Gradient Boosting and Application to Multi-scale Graph Neural Networks
Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks
Kenta Oono
Taiji Suzuki
AI4CE
31
31
0
15 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep
  neural networks
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
14
37
0
12 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A
  Mean-Field Theory
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
58
11
0
08 Jun 2020
An Overview of Neural Network Compression
An Overview of Neural Network Compression
James OÑeill
AI4CE
45
98
0
05 Jun 2020
Spectra of the Conjugate Kernel and Neural Tangent Kernel for
  linear-width neural networks
Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks
Z. Fan
Zhichao Wang
29
72
0
25 May 2020
Can Shallow Neural Networks Beat the Curse of Dimensionality? A mean
  field training perspective
Can Shallow Neural Networks Beat the Curse of Dimensionality? A mean field training perspective
Stephan Wojtowytsch
E. Weinan
MLT
18
48
0
21 May 2020
Feature Purification: How Adversarial Training Performs Robust Deep
  Learning
Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu
Yuanzhi Li
MLT
AAML
27
146
0
20 May 2020
Learning the gravitational force law and other analytic functions
Learning the gravitational force law and other analytic functions
Atish Agarwala
Abhimanyu Das
Rina Panigrahy
Qiuyi Zhang
MLT
6
0
0
15 May 2020
Analysis of Knowledge Transfer in Kernel Regime
Analysis of Knowledge Transfer in Kernel Regime
Arman Rahbar
Ashkan Panahi
Chiranjib Bhattacharyya
Devdatt Dubhashi
M. Chehreghani
13
3
0
30 Mar 2020
Symmetry & critical points for a model shallow neural network
Symmetry & critical points for a model shallow neural network
Yossi Arjevani
M. Field
26
13
0
23 Mar 2020
Frequency Bias in Neural Networks for Input of Non-Uniform Density
Frequency Bias in Neural Networks for Input of Non-Uniform Density
Ronen Basri
Meirav Galun
Amnon Geifman
David Jacobs
Yoni Kasten
S. Kritchman
34
182
0
10 Mar 2020
Convex Geometry and Duality of Over-parameterized Neural Networks
Convex Geometry and Duality of Over-parameterized Neural Networks
Tolga Ergen
Mert Pilanci
MLT
26
54
0
25 Feb 2020
Neural Networks are Convex Regularizers: Exact Polynomial-time Convex
  Optimization Formulations for Two-layer Networks
Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks
Mert Pilanci
Tolga Ergen
13
116
0
24 Feb 2020
An Optimization and Generalization Analysis for Max-Pooling Networks
An Optimization and Generalization Analysis for Max-Pooling Networks
Alon Brutzkus
Amir Globerson
MLT
AI4CE
11
4
0
22 Feb 2020
Generalisation error in learning with random features and the hidden
  manifold model
Generalisation error in learning with random features and the hidden manifold model
Federica Gerace
Bruno Loureiro
Florent Krzakala
M. Mézard
Lenka Zdeborová
25
165
0
21 Feb 2020
Learning Parities with Neural Networks
Learning Parities with Neural Networks
Amit Daniely
Eran Malach
13
76
0
18 Feb 2020
Over-parameterized Adversarial Training: An Analysis Overcoming the
  Curse of Dimensionality
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
Yi Zhang
Orestis Plevrakis
S. Du
Xingguo Li
Zhao-quan Song
Sanjeev Arora
21
51
0
16 Feb 2020
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks
  Trained with the Logistic Loss
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Lénaïc Chizat
Francis R. Bach
MLT
16
327
0
11 Feb 2020
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
Eran Malach
Gilad Yehudai
Shai Shalev-Shwartz
Ohad Shamir
48
271
0
03 Feb 2020
Memory capacity of neural networks with threshold and ReLU activations
Memory capacity of neural networks with threshold and ReLU activations
Roman Vershynin
21
21
0
20 Jan 2020
Previous
12345
Next