Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound

9 June 2019
Zhao Song, Xin Yang

Papers citing "Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound"

Showing 25 of 75 citing papers.

On Convergence and Generalization of Dropout Training. NeurIPS 2020. Poorya Mianjy, R. Arora. 23 Oct 2020.
MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery. Xiaoxiao Li, Yangsibo Huang, Binghui Peng, Zhao Song, Keqin Li. Topics: MIA, CV. 22 Oct 2020.
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network. ICML 2020. Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy. 04 Oct 2020.
Deep Equals Shallow for ReLU Networks in Kernel Regimes. A. Bietti, Francis R. Bach. 30 Sep 2020.
Generalized Leverage Score Sampling for Neural Networks. NeurIPS 2020. Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu. 21 Sep 2020.
Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH). CSIAM Trans. Appl. Math. 2020. Yuqing Li, Yaoyu Zhang, N. Yip. 07 Jul 2020.
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime. Atsushi Nitanda, Taiji Suzuki. 22 Jun 2020.
Training (Overparametrized) Neural Networks in Near-Linear Time. Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein. Topics: ODL. 20 Jun 2020.
Hardness of Learning Neural Networks with Natural Weights. Amit Daniely, Gal Vardi. 05 Jun 2020.
Network size and weights size for memorization with two-layers neural networks. Sébastien Bubeck, Ronen Eldan, Y. Lee, Dan Mikulincer. 04 Jun 2020.
Memorizing Gaussians with no over-parameterizaion via gradient decent on neural networks. Amit Daniely. Topics: VLM, MLT. 28 Mar 2020.
Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology. NeurIPS 2020. Quynh N. Nguyen, Marco Mondelli. Topics: ODL, AI4CE. 18 Feb 2020.
Learning Parities with Neural Networks. NeurIPS 2020. Amit Daniely, Eran Malach. 18 Feb 2020.
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality. NeurIPS 2020. Yi Zhang, Orestis Plevrakis, S. Du, Xingguo Li, Zhao Song, Sanjeev Arora. 16 Feb 2020.
Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent. JMLR 2020. David Holzmüller, Ingo Steinwart. Topics: MLT. 12 Feb 2020.
A Corrective View of Neural Networks: Representation, Memorization and Learning. COLT 2020. Guy Bresler, Dheeraj M. Nagaraj. Topics: MLT. 01 Feb 2020.
Memory capacity of neural networks with threshold and ReLU activations. Roman Vershynin. 20 Jan 2020.
Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis. IEEE Trans. Inf. Theory 2019. Thanh Van Nguyen, Raymond K. W. Wong, Chinmay Hegde. 27 Nov 2019.
Neural Networks Learning and Memorization with (almost) no Over-Parameterization. NeurIPS 2019. Amit Daniely. 22 Nov 2019.
Quadratic number of nodes is sufficient to learn a dataset via gradient descent. Biswarup Das, Eugene Golikov. Topics: MLT. 13 Nov 2019.
Nearly Minimal Over-Parametrization of Shallow Neural Networks. Armin Eftekhari, Chaehwan Song, Volkan Cevher. 09 Oct 2019.
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks. ICLR 2019. Ziwei Ji, Matus Telgarsky. 26 Sep 2019.
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy. ICML 2019. Jiaoyang Huang, H. Yau. 18 Sep 2019.
Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes. Allerton 2019. Kenji Kawaguchi, Jiaoyang Huang. Topics: ODL. 05 Aug 2019.
Enhancing Adversarial Defense by k-Winners-Take-All. ICLR 2019. Chang Xiao, Peilin Zhong, Changxi Zheng. Topics: AAML. 25 May 2019.