Towards moderate overparameterization: global convergence guarantees for training shallow neural networks

Samet Oymak, Mahdi Soltanolkotabi
12 February 2019 · arXiv:1902.04674

Papers citing "Towards moderate overparameterization: global convergence guarantees for training shallow neural networks"

50 / 130 papers shown

Implicit Regularization of the Deep Inverse Prior Trained with Inertia
Nathan Buskulic, Jalal Fadili, Yvain Quéau
03 Jun 2025

Curse of Dimensionality in Neural Network Optimization
Sanghoon Na, Haizhao Yang
07 Feb 2025

The Persistence of Neural Collapse Despite Low-Rank Bias: An Analytic Perspective Through Unconstrained Features
Connall Garrod, Jonathan P. Keating
30 Oct 2024

Reparameterization invariance in approximate Bayesian inference
Hrittik Roy, M. Miani, Carl Henrik Ek, Philipp Hennig, Marvin Pförtner, Lukas Tatzel, Søren Hauberg
05 Jun 2024 · BDL

Clinical Domain Knowledge-Derived Template Improves Post Hoc AI Explanations in Pneumothorax Classification
Han Yuan, Chuan Hong, Pengtao Jiang, Gangming Zhao, Nguyen Tuan Anh Tran, Xinxing Xu, Yet Yen Yan, Nan Liu
26 Mar 2024

Understanding the Double Descent Phenomenon in Deep Learning
Marc Lafon, Alexandre Thomas
15 Mar 2024

Capacity of the treelike sign perceptrons neural networks with one hidden layer -- RDT based upper bounds
M. Stojnic
13 Dec 2023

Fundamental Limits of Deep Learning-Based Binary Classifiers Trained with Hinge Loss
T. Getu, Georges Kaddoum, M. Bennis
13 Sep 2023

Random feature approximation for general spectral methods
Mike Nguyen, Nicole Mücke
29 Aug 2023

Six Lectures on Linearized Neural Networks
Theodor Misiakiewicz, Andrea Montanari
25 Aug 2023

Convergence of Two-Layer Regression with Nonlinear Units
Yichuan Deng, Zhao Song, Shenghao Xie
16 Aug 2023

Memory capacity of two layer neural networks with smooth activations
Liam Madden, Christos Thrampoulidis
03 Aug 2023 · MLT

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian, Yiping Wang, Beidi Chen, S. Du
25 May 2023 · MLT

Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri, Christos Thrampoulidis
22 May 2023 · MLT

Convergence Guarantees of Overparametrized Wide Deep Inverse Prior
Nathan Buskulic, Yvain Quéau, M. Fadili
20 Mar 2023 · BDL

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu, S. Du
20 Feb 2023

Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks
Shuai Zhang, Meng Wang, Pin-Yu Chen, Sijia Liu, Songtao Lu, Miaoyuan Liu
06 Feb 2023 · MLT

Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang
02 Feb 2023

Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks
Ilja Kuzborskij, Csaba Szepesvári
28 Dec 2022

Learning threshold neurons via the "edge of stability"
Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang
14 Dec 2022 · MLT

Matching DNN Compression and Cooperative Training with Resources and Data Availability
F. Malandrino, G. Giacomo, Armin Karamzade, Marco Levorato, C. Chiasserini
02 Dec 2022

Bypass Exponential Time Preprocessing: Fast Neural Network Training via Weight-Data Correlation Preprocessing
Josh Alman, Jiehao Liang, Zhao Song, Ruizhe Zhang, Danyang Zhuo
25 Nov 2022

Spectral Evolution and Invariance in Linear-width Neural Networks
Zhichao Wang, A. Engel, Anand D. Sarwate, Ioana Dumitriu, Tony Chiang
11 Nov 2022

Cold Start Streaming Learning for Deep Networks
Cameron R. Wolfe, Anastasios Kyrillidis
09 Nov 2022 · CLL

Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier, T. Klock, Marco Mondelli, Michael Rauchensteiner
08 Nov 2022

A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna
28 Oct 2022 · MLT

When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo
21 Oct 2022

On skip connections and normalisation layers in deep optimisation
L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey
10 Oct 2022 · ODL

Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
R. Gentile, G. Welper
17 Sep 2022 · ODL

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
15 Sep 2022

Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
15 Sep 2022 · AI4CE

A Sublinear Adversarial Training Algorithm
Yeqi Gao, Lianke Qin, Zhao Song, Yitan Wang
10 Aug 2022 · GAN

Training Overparametrized Neural Networks in Sublinear Time
Yichuan Deng, Han Hu, Zhao Song, Omri Weinstein, Danyang Zhuo
09 Aug 2022 · BDL

Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Jianyi Yang, Shaolei Ren
02 Jul 2022

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff
26 Jun 2022

Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime
Benjamin Bowman, Guido Montúfar
06 Jun 2022

Blind Estimation of a Doubly Selective OFDM Channel: A Deep Learning Algorithm and Theory
T. Getu, N. Golmie, D. Griffith
30 May 2022

Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable
Promit Ghosal, Srinath Mahankali, Yihang Sun
24 May 2022 · MLT

On Feature Learning in Neural Networks with Global Convergence Guarantees
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna
22 Apr 2022 · MLT

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models
Chaoyue Liu, Libin Zhu, M. Belkin
10 Mar 2022

S-Rocket: Selective Random Convolution Kernels for Time Series Classification
Hojjat Salehinejad, Yang Wang, Yuanhao Yu, Jingshan Tang, S. Valaee
07 Mar 2022 · AI4TS

On the Omnipresence of Spurious Local Minima in Certain Neural Network Training Problems
C. Christof, Julia Kowalczyk
23 Feb 2022

Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk, J. Cyranka
28 Jan 2022 · ODL

Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks
Benjamin Bowman, Guido Montúfar
12 Jan 2022

AutoBalance: Optimized Loss Functions for Imbalanced Data
Mingchen Li, Xuechen Zhang, Christos Thrampoulidis, Jiasi Chen, Samet Oymak
04 Jan 2022

Training Multi-Layer Over-Parametrized Neural Network in Subquadratic Time
Zhao Song, Lichen Zhang, Ruizhe Zhang
14 Dec 2021

On the Convergence of Shallow Neural Network Training with Randomly Masked Neurons
Fangshuo Liao, Anastasios Kyrillidis
05 Dec 2021

Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models
Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré
30 Nov 2021

SGD Through the Lens of Kolmogorov Complexity
Gregory Schwartzman
10 Nov 2021

Subquadratic Overparameterization for Shallow Neural Networks
Chaehwan Song, Ali Ramezani-Kebrya, Thomas Pethick, Armin Eftekhari, Volkan Cevher
02 Nov 2021