SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data
arXiv: 1710.10174 · 27 October 2017
Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz
Topics: MLT

Papers citing "SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data" (50 / 52 papers shown)
- Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes · Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett · 05 Apr 2025
- SCoTTi: Save Computation at Training Time with an adaptive framework · Ziyu Li, Enzo Tartaglione, Van-Tam Nguyen · 19 Dec 2023
- Polynomially Over-Parameterized Convolutional Neural Networks Contain Structured Strong Winning Lottery Tickets · A. D. Cunha, Francesco d’Amore, Emanuele Natale · MLT · 16 Nov 2023
- Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization · Hancheng Min, Enrique Mallada, René Vidal · MLT · 24 Jul 2023
- Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks · Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou · MLT · 26 May 2023
- Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off · Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding · 30 Nov 2022
- Do highly over-parameterized neural networks generalize since bad solutions are rare? · Julius Martinetz, T. Martinetz · 07 Nov 2022
- Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks · Louis Schatzki, Martín Larocca, Quynh T. Nguyen, F. Sauvage, M. Cerezo · 18 Oct 2022
- Annihilation of Spurious Minima in Two-Layer ReLU Networks · Yossi Arjevani, M. Field · 12 Oct 2022
- Implicit Full Waveform Inversion with Deep Neural Representation · Jian-jun Sun, K. Innanen · AI4CE · 08 Sep 2022
- On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms · Lam M. Nguyen, Trang H. Tran · 13 Jun 2022
- Deep Layer-wise Networks Have Closed-Form Weights · Chieh-Tsai Wu, A. Masoomi, A. Gretton, Jennifer Dy · 01 Feb 2022
- Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks · Bartlomiej Polaczyk, J. Cyranka · ODL · 28 Jan 2022
- How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis · Shuai Zhang, M. Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong · SSL, MLT · 21 Jan 2022
- Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape · Devansh Bisla, Jing Wang, A. Choromańska · 20 Jan 2022
- Regularization by Misclassification in ReLU Neural Networks · Elisabetta Cornacchia, Jan Hązła, Ido Nachum, Amir Yehudayoff · NoLa · 03 Nov 2021
- Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks · Tolga Ergen, Mert Pilanci · 18 Oct 2021
- Theory of overparametrization in quantum neural networks · Martín Larocca, Nathan Ju, Diego García-Martín, Patrick J. Coles, M. Cerezo · 23 Sep 2021
- Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent · Spencer Frei, Quanquan Gu · 25 Jun 2021
- Gradient Starvation: A Learning Proclivity in Neural Networks · Mohammad Pezeshki, Sekouba Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie · MLT · 18 Nov 2020
- LOss-Based SensiTivity rEgulaRization: towards deep sparse neural networks · Enzo Tartaglione, Andrea Bragagnolo, A. Fiandrotti, Marco Grangetto · ODL, UQCV · 16 Nov 2020
- Deep Learning is Singular, and That's Good · Daniel Murfet, Susan Wei, Mingming Gong, Hui Li, Jesse Gell-Redman, T. Quella · UQCV · 22 Oct 2020
- Predicting Training Time Without Training · L. Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto · 28 Aug 2020
- Neural Anisotropy Directions · Guillermo Ortiz-Jiménez, Apostolos Modas, Seyed-Mohsen Moosavi-Dezfooli, P. Frossard · 17 Jun 2020
- Non-convergence of stochastic gradient descent in the training of deep neural networks · Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 12 Jun 2020
- Feature Purification: How Adversarial Training Performs Robust Deep Learning · Zeyuan Allen-Zhu, Yuanzhi Li · MLT, AAML · 20 May 2020
- Symmetry & critical points for a model shallow neural network · Yossi Arjevani, M. Field · 23 Mar 2020
- Convex Geometry and Duality of Over-parameterized Neural Networks · Tolga Ergen, Mert Pilanci · MLT · 25 Feb 2020
- An Optimization and Generalization Analysis for Max-Pooling Networks · Alon Brutzkus, Amir Globerson · MLT, AI4CE · 22 Feb 2020
- Learning Parities with Neural Networks · Amit Daniely, Eran Malach · 18 Feb 2020
- Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity · Shiyu Liang, Ruoyu Sun, R. Srikant · 31 Dec 2019
- How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections? · Kartikeya Bhardwaj, Guihong Li, R. Marculescu · 02 Oct 2019
- Neural ODEs as the Deep Limit of ResNets with constant weights · B. Avelin, K. Nystrom · ODL · 28 Jun 2019
- On the Noisy Gradient Descent that Generalizes as SGD · Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu · MLT · 18 Jun 2019
- Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems · Atsushi Nitanda, Geoffrey Chinot, Taiji Suzuki · MLT · 23 May 2019
- Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks · Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak · NoLa · 27 Mar 2019
- Is Deeper Better only when Shallow is Good? · Eran Malach, Shai Shalev-Shwartz · 08 Mar 2019
- A Priori Estimates of the Population Risk for Residual Networks · Weinan E, Chao Ma, Qingcan Wang · UQCV · 06 Mar 2019
- Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization · Hesham Mostafa, Xin Wang · 15 Feb 2019
- On a Sparse Shortcut Topology of Artificial Neural Networks · Fenglei Fan, Dayang Wang, Hengtao Guo, Qikui Zhu, Pingkun Yan, Ge Wang, Hengyong Yu · 22 Nov 2018
- Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks · Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu · ODL · 21 Nov 2018
- Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity · Chulhee Yun, S. Sra, Ali Jadbabaie · 17 Oct 2018
- A Priori Estimates of the Population Risk for Two-layer Neural Networks · Weinan E, Chao Ma, Lei Wu · 15 Oct 2018
- Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel · Colin Wei, J. Lee, Qiang Liu, Tengyu Ma · 12 Oct 2018
- A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks · Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu · 04 Oct 2018
- Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization · G. Wang, G. Giannakis, Jie Chen · MLT · 14 Aug 2018
- Generalization Error in Deep Learning · Daniel Jakubovitz, Raja Giryes, M. Rodrigues · AI4CE · 03 Aug 2018
- ResNet with one-neuron hidden layers is a Universal Approximator · Hongzhou Lin, Stefanie Jegelka · 28 Jun 2018
- When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models? · Tengyu Xu, Yi Zhou, Kaiyi Ji, Yingbin Liang · 12 Jun 2018
- Data augmentation instead of explicit regularization · Alex Hernández-García, Peter König · 11 Jun 2018