Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
arXiv:1906.03593 (v2, latest)
9 June 2019
Zhao Song, Xin Yang

Papers citing "Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound"
Showing 25 of 75 citing papers.

On Convergence and Generalization of Dropout Training
Neural Information Processing Systems (NeurIPS), 2020
Poorya Mianjy, R. Arora
177 | 32 | 0 | 23 Oct 2020

MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery
Xiaoxiao Li, Yangsibo Huang, Binghui Peng, Zhao Song, Keqin Li
MIACV
128 | 1 | 0 | 22 Oct 2020

A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
International Conference on Machine Learning (ICML), 2020
Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy
301 | 24 | 0 | 04 Oct 2020

Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti, Francis R. Bach
283 | 93 | 0 | 30 Sep 2020

Generalized Leverage Score Sampling for Neural Networks
Neural Information Processing Systems (NeurIPS), 2020
Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu
130 | 45 | 0 | 21 Sep 2020

Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH)
CSIAM Transactions on Applied Mathematics (CSIAM Trans. Appl. Math.), 2020
Yuqing Li, Yaoyu Zhang, N. Yip
157 | 5 | 0 | 07 Jul 2020

Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime
Atsushi Nitanda, Taiji Suzuki
169 | 44 | 0 | 22 Jun 2020

Training (Overparametrized) Neural Networks in Near-Linear Time
Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein
ODL
147 | 83 | 0 | 20 Jun 2020

Hardness of Learning Neural Networks with Natural Weights
Amit Daniely, Gal Vardi
164 | 21 | 0 | 05 Jun 2020

Network size and weights size for memorization with two-layers neural networks
Sébastien Bubeck, Ronen Eldan, Y. Lee, Dan Mikulincer
139 | 33 | 0 | 04 Jun 2020

Memorizing Gaussians with no over-parameterizaion via gradient decent on neural networks
Amit Daniely
VLM, MLT
80 | 14 | 0 | 28 Mar 2020

Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology
Neural Information Processing Systems (NeurIPS), 2020
Quynh N. Nguyen, Marco Mondelli
ODL, AI4CE
211 | 78 | 0 | 18 Feb 2020

Learning Parities with Neural Networks
Neural Information Processing Systems (NeurIPS), 2020
Amit Daniely, Eran Malach
181 | 86 | 0 | 18 Feb 2020

Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
Neural Information Processing Systems (NeurIPS), 2020
Yi Zhang, Orestis Plevrakis, S. Du, Xingguo Li, Zhao Song, Sanjeev Arora
184 | 55 | 0 | 16 Feb 2020

Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent
Journal of Machine Learning Research (JMLR), 2020
David Holzmüller, Ingo Steinwart
MLT
126 | 8 | 0 | 12 Feb 2020

A Corrective View of Neural Networks: Representation, Memorization and Learning
Conference on Learning Theory (COLT), 2020
Guy Bresler, Dheeraj M. Nagaraj
MLT
155 | 19 | 0 | 01 Feb 2020

Memory capacity of neural networks with threshold and ReLU activations
Roman Vershynin
111 | 21 | 0 | 20 Jan 2020

Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2019
Thanh Van Nguyen, Raymond K. W. Wong, Chinmay Hegde
155 | 14 | 0 | 27 Nov 2019

Neural Networks Learning and Memorization with (almost) no Over-Parameterization
Neural Information Processing Systems (NeurIPS), 2019
Amit Daniely
124 | 34 | 0 | 22 Nov 2019

Quadratic number of nodes is sufficient to learn a dataset via gradient descent
Biswarup Das, Eugene Golikov
MLT
44 | 0 | 0 | 13 Nov 2019

Nearly Minimal Over-Parametrization of Shallow Neural Networks
Armin Eftekhari, Chaehwan Song, Volkan Cevher
99 | 1 | 0 | 09 Oct 2019

09 Oct 2019
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
International Conference on Learning Representations (ICLR), 2019
Ziwei Ji
Matus Telgarsky
168
185
0
26 Sep 2019
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy
International Conference on Machine Learning (ICML), 2019
Jiaoyang Huang, H. Yau
99 | 158 | 0 | 18 Sep 2019

Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes
Allerton Conference on Communication, Control, and Computing (Allerton), 2019
Kenji Kawaguchi, Jiaoyang Huang
ODL
132 | 63 | 0 | 05 Aug 2019

Enhancing Adversarial Defense by k-Winners-Take-All
International Conference on Learning Representations (ICLR), 2019
Chang Xiao, Peilin Zhong, Changxi Zheng
AAML
171 | 107 | 0 | 25 May 2019