A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network

Conference on Learning Theory (COLT), 2021
4 February 2021
Mo Zhou, Rong Ge, Chi Jin

Papers citing "A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network"

38 papers

A Derandomization Framework for Structure Discovery: Applications in Neural Networks and Beyond
Nikos Tsikouras, Yorgos Pantis, Ioannis Mitliagkas, Christos Tzamos
22 Oct 2025

Graph Coloring for Multi-Task Learning
Santosh Patapati
21 Sep 2025

From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Lojasiewicz Regions
Agnideep Aich, Ashit Aich, Bruce Wade
29 Jul 2025

Deep Learning Optimization Using Self-Adaptive Weighted Auxiliary Variables
Yaru Liu, Yiqi Gu, Michael K. Ng
30 Apr 2025

Convergence of Shallow ReLU Networks on Weakly Interacting Data
Léo Dana, Francis R. Bach, Loucas Pillaud-Vivien
24 Feb 2025

Curse of Dimensionality in Neural Network Optimization
Sanghoon Na, Haizhao Yang
07 Feb 2025

Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time
International Conference on Machine Learning (ICML), 2024
Sungyoon Kim, Mert Pilanci
06 Feb 2024

Convergence Analysis for Learning Orthonormal Deep Linear Neural Networks
IEEE Signal Processing Letters (IEEE SPL), 2023
Zhen Qin, Xuwei Tan, Zhihui Zhu
24 Nov 2023

On the Optimization and Generalization of Multi-head Attention
Puneesh Deora, Rouzbeh Ghaderi, Hossein Taheri, Christos Thrampoulidis
19 Oct 2023

Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
Pulkit Gopalani, Samyak Jha, Anirbit Mukherjee
17 Sep 2023

SAfER: Layer-Level Sensitivity Assessment for Efficient and Robust Neural Network Inference
Edouard Yvinec, Arnaud Dapogny, Kévin Bailly, Xavier Fischer
09 Aug 2023

Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
International Conference on Machine Learning (ICML), 2023
Liam Collins, Hamed Hassani, Mahdi Soltanolkotabi, Aryan Mokhtari, Sanjay Shakkottai
13 Jul 2023

Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks
Neural Information Processing Systems (NeurIPS), 2023
Ziyi Huang, Henry Lam, Haofeng Zhang
09 Jun 2023

Toward $L_\infty$-recovery of Nonlinear Functions: A Polynomial Sample Complexity Bound for Gaussian Random Fields
Conference on Learning Theory (COLT), 2023
Kefan Dong, Tengyu Ma
29 Apr 2023

Depth Separation with Multilayer Mean-Field Networks
International Conference on Learning Representations (ICLR), 2023
Y. Ren, Mo Zhou, Rong Ge
03 Apr 2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Conference on Learning Theory (COLT), 2023
Weihang Xu, S. Du
20 Feb 2023

Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier, T. Klock, Marco Mondelli, Michael Rauchensteiner
08 Nov 2022

Learning Single-Index Models with Shallow Neural Networks
Neural Information Processing Systems (NeurIPS), 2022
A. Bietti, Joan Bruna, Clayton Sanford, M. Song
27 Oct 2022

When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Neural Information Processing Systems (NeurIPS), 2022
Jiawei Zhang, Yushun Zhang, Mingyi Hong, Tian Ding, Jianfeng Yao
21 Oct 2022

Global Convergence of SGD On Two Layer Neural Nets
Information and Inference: A Journal of the IMA, 2022
Pulkit Gopalani, Anirbit Mukherjee
20 Oct 2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
International Conference on Learning Representations (ICLR), 2022
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu
29 Sep 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Neural Information Processing Systems (NeurIPS), 2022
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
15 Sep 2022

Optimizing the Performative Risk under Weak Convexity Assumptions
Yulai Zhao
02 Sep 2022

Intersection of Parallels as an Early Stopping Criterion
International Conference on Information and Knowledge Management (CIKM), 2022
Ali Vardasbi, Maarten de Rijke, Mostafa Dehghani
19 Aug 2022

Parameter Convex Neural Networks
Jingcheng Zhou, Wei Wei, Xing Li, Bowen Pang, Zhiming Zheng
11 Jun 2022

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Neural Information Processing Systems (NeurIPS), 2022
Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion
02 Jun 2022

Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
International Conference on Learning Representations (ICLR), 2022
Shunta Akiyama, Taiji Suzuki
30 May 2022

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Neural Information Processing Systems (NeurIPS), 2022
Itay Safran, Gal Vardi, Jason D. Lee
18 May 2022

On Feature Learning in Neural Networks with Global Convergence Guarantees
International Conference on Learning Representations (ICLR), 2022
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna
22 Apr 2022

Parameter identifiability of a deep feedforward ReLU neural network
Machine Learning, 2021
Joachim Bona-Pellissier, François Bachoc, François Malgouyres
24 Dec 2021

Global convergence of ResNets: From finite to infinite width using linear parameterization
Raphael Barboni, Gabriel Peyré, François-Xavier Vialard
10 Dec 2021

Gradient Descent on Infinitely Wide Neural Networks: Global Convergence and Generalization
Francis R. Bach, Lénaïc Chizat
15 Oct 2021

The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program
Yifei Wang, Mert Pilanci
13 Oct 2021

Deep Networks Provably Classify Data on Curves
Neural Information Processing Systems (NeurIPS), 2021
Tingran Wang, Sam Buchanan, D. Gilboa, John N. Wright
29 Jul 2021

Sparse Bayesian Deep Learning for Dynamic System Identification
Hongpeng Zhou, Chahine Ibrahim, W. Zheng, Wei Pan
27 Jul 2021

On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
International Conference on Machine Learning (ICML), 2021
Shunta Akiyama, Taiji Suzuki
11 Jun 2021

A Geometric Analysis of Neural Collapse with Unconstrained Features
Neural Information Processing Systems (NeurIPS), 2021
Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu
06 May 2021

Stable Recovery of Entangled Weights: Towards Robust Identification of Deep Neural Networks from Minimal Samples
Applied and Computational Harmonic Analysis (ACHA), 2021
Christian Fiedler, M. Fornasier, T. Klock, Michael Rauchensteiner
18 Jan 2021