
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning

2 February 2023

Papers citing "Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning"


NTK-Guided Few-Shot Class Incremental Learning Jingren Liu Zhong Ji Yanwei Pang YunLong Yu CLL 237 11 0 19 Mar 2024
Posterior Inference on Shallow Infinitely Wide Bayesian Neural Networks under Weights with Unbounded Variance. Conference on Uncertainty in Artificial Intelligence (UAI), 2023 Jorge Loría A. Bhadra UQCV BDL 316 2 0 18 May 2023
Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility. Journal of Machine Learning Research (JMLR), 2022 Hoileong Lee Fadhel Ayed Paul Jung Juho Lee Hongseok Yang François Caron 195 13 0 17 May 2022
On Feature Learning in Neural Networks with Global Convergence Guarantees. International Conference on Learning Representations (ICLR), 2022 Zhengdao Chen Eric Vanden-Eijnden Joan Bruna MLT 198 15 0 22 Apr 2022
Random Feature Amplification: Feature Learning and Generalization in Neural Networks. Journal of Machine Learning Research (JMLR), 2022 Spencer Frei Niladri S. Chatterji Peter L. Bartlett MLT 254 31 0 15 Feb 2022
α-Stable convergence of heavy-tailed infinitely-wide neural networks Paul Jung Hoileong Lee Jiho Lee Hongseok Yang 104 7 0 18 Jun 2021
Deep learning: a statistical viewpoint. Acta Numerica (AN), 2021 Peter L. Bartlett Andrea Montanari Alexander Rakhlin 188 314 0 16 Mar 2021
Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels. International Conference on Machine Learning (ICML), 2021 Eran Malach Pritish Kamath Emmanuel Abbe Nathan Srebro 208 43 0 01 Mar 2021
Large-width functional asymptotics for deep Gaussian neural networks. International Conference on Learning Representations (ICLR), 2021 Daniele Bracale Stefano Favaro S. Fortini Stefano Peluchetti 139 17 0 20 Feb 2021
Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks. International Conference on Machine Learning (ICML), 2020 Quynh N. Nguyen Marco Mondelli Guido Montúfar 576 94 0 21 Dec 2020
Phase diagram for two-layer ReLU neural networks at infinite-width limit. Journal of Machine Learning Research (JMLR), 2020 Tao Luo Zhi-Qin John Xu Zheng Ma Yaoyu Zhang 178 70 0 15 Jul 2020
When Do Neural Networks Outperform Kernel Methods? Neural Information Processing Systems (NeurIPS), 2020 Behrooz Ghorbani Song Mei Theodor Misiakiewicz Andrea Montanari 265 199 0 24 Jun 2020
On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization. International Joint Conference on Artificial Intelligence (IJCAI), 2020 Wei Huang Weitao Du R. Xu 120 40 0 13 Apr 2020
Stable behaviour of infinitely wide deep neural networks. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020 Stefano Favaro S. Fortini Stefano Peluchetti BDL 149 31 0 01 Mar 2020
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks. International Conference on Learning Representations (ICLR), 2020 Wei Hu Lechao Xiao Jeffrey Pennington 165 126 0 16 Jan 2020
Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes. Neural Information Processing Systems (NeurIPS), 2019 Greg Yang 393 218 0 28 Oct 2019
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy. International Conference on Machine Learning (ICML), 2019 Jiaoyang Huang H. Yau 131 160 0 18 Sep 2019
Kernel and Rich Regimes in Overparametrized Models. Annual Conference Computational Learning Theory (COLT), 2019 Blake E. Woodworth Suriya Gunasekar Pedro H. P. Savarese E. Moroshko Itay Golan Jason D. Lee Daniel Soudry Nathan Srebro 299 388 0 13 Jun 2019
An Improved Analysis of Training Over-parameterized Deep Neural Networks. Neural Information Processing Systems (NeurIPS), 2019 Difan Zou Quanquan Gu 141 244 0 11 Jun 2019
On Exact Computation with an Infinitely Wide Neural Net Sanjeev Arora S. Du Wei Hu Zhiyuan Li Ruslan Salakhutdinov Ruosong Wang 533 981 0 26 Apr 2019
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent Jaehoon Lee Lechao Xiao S. Schoenholz Yasaman Bahri Roman Novak Jascha Narain Sohl-Dickstein Jeffrey Pennington 530 1,201 0 18 Feb 2019
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit Song Mei Theodor Misiakiewicz Andrea Montanari MLT 267 300 0 16 Feb 2019
Towards moderate overparameterization: global convergence guarantees for training shallow neural networks. IEEE Journal on Selected Areas in Information Theory (JSAIT), 2019 Samet Oymak Mahdi Soltanolkotabi 193 336 0 12 Feb 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks Sanjeev Arora S. Du Wei Hu Zhiyuan Li Ruosong Wang MLT 544 1,024 0 24 Jan 2019
On Lazy Training in Differentiable Programming Lénaïc Chizat Edouard Oyallon Francis R. Bach 466 901 0 19 Dec 2018
Gradient Descent Finds Global Minima of Deep Neural Networks. International Conference on Machine Learning (ICML), 2018 S. Du Jason D. Lee Haochuan Li Liwei Wang Masayoshi Tomizuka ODL 755 1,184 0 09 Nov 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks S. Du Xiyu Zhai Barnabás Póczós Aarti Singh MLT ODL 659 1,333 0 04 Oct 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks Arthur Jacot Franck Gabriel Clément Hongler 1.6K 3,612 0 20 Jun 2018
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport Lénaïc Chizat Francis R. Bach OT 371 791 0 24 May 2018
Gaussian Process Behaviour in Wide Deep Neural Networks A. G. Matthews Mark Rowland Jiri Hron Richard Turner Zoubin Ghahramani BDL 373 592 0 30 Apr 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks Song Mei Andrea Montanari Phan-Minh Nguyen MLT 338 920 0 18 Apr 2018
Deep Neural Networks as Gaussian Processes Jaehoon Lee Yasaman Bahri Roman Novak S. Schoenholz Jeffrey Pennington Jascha Narain Sohl-Dickstein UQCV BDL 568 1,174 0 01 Nov 2017
The spectrum of kernel random matrices N. Karoui 387 235 0 04 Jan 2010