Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound

9 June 2019
Zhao Song, Xin Yang

Papers citing "Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound"

Showing 25 of 75 citing papers.

On Convergence and Generalization of Dropout Training. NeurIPS 2020. Poorya Mianjy, R. Arora. 23 Oct 2020.
MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery. Xiaoxiao Li, Yangsibo Huang, Binghui Peng, Zhao Song, Keqin Li. Topics: MIA, CV. 22 Oct 2020.
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network. ICML 2020. Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy. 04 Oct 2020.
Deep Equals Shallow for ReLU Networks in Kernel Regimes. A. Bietti, Francis R. Bach. 30 Sep 2020.
Generalized Leverage Score Sampling for Neural Networks. NeurIPS 2020. Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu. 21 Sep 2020.
Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH). CSIAM Trans. Appl. Math. 2020. Yuqing Li, Yaoyu Zhang, N. Yip. 07 Jul 2020.
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime. Atsushi Nitanda, Taiji Suzuki. 22 Jun 2020.
Training (Overparametrized) Neural Networks in Near-Linear Time. Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein. Topics: ODL. 20 Jun 2020.
Hardness of Learning Neural Networks with Natural Weights. Amit Daniely, Gal Vardi. 05 Jun 2020.
Network size and weights size for memorization with two-layers neural networks. Sébastien Bubeck, Ronen Eldan, Y. Lee, Dan Mikulincer. 04 Jun 2020.
Memorizing Gaussians with no over-parameterizaion via gradient decent on neural networks. Amit Daniely. Topics: VLM, MLT. 28 Mar 2020.
Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology. NeurIPS 2020. Quynh N. Nguyen, Marco Mondelli. Topics: ODL, AI4CE. 18 Feb 2020.
Learning Parities with Neural Networks. NeurIPS 2020. Amit Daniely, Eran Malach. 18 Feb 2020.
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality. NeurIPS 2020. Yi Zhang, Orestis Plevrakis, S. Du, Xingguo Li, Zhao Song, Sanjeev Arora. 16 Feb 2020.
Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent. JMLR 2020. David Holzmüller, Ingo Steinwart. Topics: MLT. 12 Feb 2020.
A Corrective View of Neural Networks: Representation, Memorization and Learning. COLT 2020. Guy Bresler, Dheeraj M. Nagaraj. Topics: MLT. 01 Feb 2020.
Memory capacity of neural networks with threshold and ReLU activations. Roman Vershynin. 20 Jan 2020.
Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis. IEEE Trans. Inf. Theory 2019. Thanh Van Nguyen, Raymond K. W. Wong, Chinmay Hegde. 27 Nov 2019.
Neural Networks Learning and Memorization with (almost) no Over-Parameterization. NeurIPS 2019. Amit Daniely. 22 Nov 2019.
Quadratic number of nodes is sufficient to learn a dataset via gradient descent. Biswarup Das, Eugene Golikov. Topics: MLT. 13 Nov 2019.
Nearly Minimal Over-Parametrization of Shallow Neural Networks. Armin Eftekhari, Chaehwan Song, Volkan Cevher. 09 Oct 2019.
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks. ICLR 2019. Ziwei Ji, Matus Telgarsky. 26 Sep 2019.
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy. ICML 2019. Jiaoyang Huang, H. Yau. 18 Sep 2019.
Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes. Allerton 2019. Kenji Kawaguchi, Jiaoyang Huang. Topics: ODL. 05 Aug 2019.
Enhancing Adversarial Defense by k-Winners-Take-All. ICLR 2019. Chang Xiao, Peilin Zhong, Changxi Zheng. Topics: AAML. 25 May 2019.