Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods

Taiji Suzuki, Shunta Akiyama
International Conference on Learning Representations (ICLR), 2020. 6 December 2020. arXiv: 2012.03224. [MLT]

Papers citing "Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods"

Showing 11 of 11 citing papers.
Operator Learning Using Random Features: A Tool for Scientific Computing
Nicholas H. Nelsen, Andrew M. Stuart. SIAM Review (SIAM Rev.), 2024. 12 Aug 2024.

SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem
Margalit Glasgow. International Conference on Learning Representations (ICLR), 2023. 26 Sep 2023. [MLT]

SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz. Annual Conference Computational Learning Theory (COLT), 2023. 21 Feb 2023. [FedML, MLT]

Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks
Yunwen Lei, Rong Jin, Yiming Ying. Neural Information Processing Systems (NeurIPS), 2022. 19 Sep 2022. [MLT]

Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
Shunta Akiyama, Taiji Suzuki. International Conference on Learning Representations (ICLR), 2022. 30 May 2022.

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang. Neural Information Processing Systems (NeurIPS), 2022. 03 May 2022. [MLT]

Stability & Generalisation of Gradient Descent for Shallow Neural
  Networks without the Neural Tangent Kernel
Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent KernelNeural Information Processing Systems (NeurIPS), 2021
Dominic Richards
Ilja Kuzborskij
207
37
0
27 Jul 2021
On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
Shunta Akiyama, Taiji Suzuki. International Conference on Machine Learning (ICML), 2021. 11 Jun 2021. [MLT]

Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed
Maria Refinetti, Sebastian Goldt, Florent Krzakala, Lenka Zdeborová. International Conference on Machine Learning (ICML), 2021. 23 Feb 2021.

Dimension-free convergence rates for gradient Langevin dynamics in RKHS
Boris Muzellec, Kanji Sato, Mathurin Massias, Taiji Suzuki. Annual Conference Computational Learning Theory (COLT), 2020. 29 Feb 2020.

Deep learning is adaptive to intrinsic dimensionality of model smoothness in anisotropic Besov space
Taiji Suzuki, Atsushi Nitanda. Neural Information Processing Systems (NeurIPS), 2019. 28 Oct 2019.