Gradient Descent Quantizes ReLU Network Features

22 March 2018
Hartmut Maennel, Olivier Bousquet, Sylvain Gelly
MLT
ArXiv (abs) · PDF · HTML

Papers citing "Gradient Descent Quantizes ReLU Network Features"

50 / 55 papers shown

Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks
D. Kunin, Giovanni Luca Marchetti, F. Chen, Dhruva Karkada, James B. Simon, M. DeWeese, Surya Ganguli, Nina Miolane
06 Jun 2025

Benignity of loss landscape with weight decay requires both large overparametrization and initialization
Etienne Boursier, Matthew Bowditch, Matthias Englert, R. Lazic
28 May 2025

An overview of condensation phenomenon in deep learning
Zhi-Qin John Xu, Yaoyu Zhang, Zhangchen Zhou
AI4CE
13 Apr 2025

The Spectral Bias of Shallow Neural Network Learning is Shaped by the Choice of Non-linearity
Justin Sahs, Ryan Pyle, Fabio Anselmi, Ankit B. Patel
13 Mar 2025

Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
Rui Lu, Runzhe Wang, Kaifeng Lyu, Xitai Jiang, Gao Huang, Mengdi Wang
DiffM
05 Mar 2025

Convergence of Shallow ReLU Networks on Weakly Interacting Data
Léo Dana, Francis R. Bach, Loucas Pillaud-Vivien
MLT
24 Feb 2025

Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek, Amire Bendjeddou, Daniel Hsu
13 Nov 2024

Swing-by Dynamics in Concept Learning and Compositional Generalization
Yongyi Yang, Core Francisco Park, Ekdeep Singh Lubana, Maya Okawa, Wei Hu, Hidenori Tanaka
CoGe, DiffM
10 Oct 2024

Simplicity bias and optimization threshold in two-layer ReLU networks
Etienne Boursier, Nicolas Flammarion
03 Oct 2024

Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization
Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang, Zhiwei Bai
26 Jun 2024

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin, Allan Raventós, Clémentine Dominé, Feng Chen, David Klindt, Andrew M. Saxe, Surya Ganguli
MLT
10 Jun 2024

Can Implicit Bias Imply Adversarial Robustness?
Hancheng Min, René Vidal
24 May 2024

Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar, Jarvis Haupt
ODL
12 Mar 2024

On the dynamics of three-layer neural networks: initial condensation
Zheng-an Chen, Tao Luo
MLT, AI4CE
25 Feb 2024

A topological description of loss surfaces based on Betti Numbers
Maria Sofia Bucarelli, Giuseppe Alessio D’Inverno, Monica Bianchini, F. Scarselli, Fabrizio Silvestri
08 Jan 2024

SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem
Margalit Glasgow
MLT
26 Sep 2023

RHINO: Regularizing the Hash-based Implicit Neural Representation
Hao Zhu, Feng Liu, Qi Zhang, Xun Cao, Zhan Ma
22 Sep 2023

Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min, Enrique Mallada, René Vidal
MLT
24 Jul 2023

Optimistic Estimate Uncovers the Potential of Nonlinear Models
Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Z. Xu
18 Jul 2023

Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
D. Chistikov, Matthias Englert, R. Lazic
MLT
10 Jun 2023

Loss Spike in Training Neural Networks
Zhongwang Zhang, Z. Xu
20 May 2023

Understanding the Initial Condensation of Convolutional Neural Networks
Zhangchen Zhou, Hanxu Zhou, Yuqing Li, Zhi-Qin John Xu
MLT, AI4CE
17 May 2023

Phase Diagram of Initial Condensation for Two-layer Neural Networks
Zheng Chen, Yuqing Li, Yaoyu Zhang, Zhaoguang Zhou, Z. Xu
MLT, AI4CE
12 Mar 2023

Linear Stability Hypothesis and Rank Stratification for Nonlinear Models
Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Z. Xu
21 Nov 2022

Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee, Byeongsu Sim, Jong Chul Ye
MLT
27 Sep 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
15 Sep 2022

Implicit regularization of dropout
Zhongwang Zhang, Zhi-Qin John Xu
13 Jul 2022

Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart
MLT
24 Jun 2022

Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation
Loucas Pillaud-Vivien, J. Reygner, Nicolas Flammarion
NoLa
20 Jun 2022

Intrinsic dimensionality and generalization properties of the $\mathcal{R}$-norm inductive bias
Navid Ardeshir, Daniel J. Hsu, Clayton Sanford
CML, AI4CE
10 Jun 2022

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion
ODL
02 Jun 2022

Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Yaoyu Zhang, Zhi-Qin John Xu
24 May 2022

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran, Gal Vardi, Jason D. Lee
MLT
18 May 2022

On Regularizing Coordinate-MLPs
Sameera Ramasinghe, L. MacDonald, Simon Lucey
01 Feb 2022

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks
Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Z. Xu
30 Nov 2021

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
Aleksandr Shevchenko, Vyacheslav Kungurtsev, Marco Mondelli
MLT
03 Nov 2021

Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
Kaifeng Lyu, Zhiyuan Li, Runzhe Wang, Sanjeev Arora
MLT
26 Oct 2021

The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program
Yifei Wang, Mert Pilanci
MLT, MDE
13 Oct 2021

Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation
Arnulf Jentzen, Adrian Riekert
09 Jul 2021

Towards Understanding the Condensation of Neural Networks at Initial Training
Hanxu Zhou, Qixuan Zhou, Yaoyu Zhang, Z. Xu
MLT, AI4CE
25 May 2021

Initializing ReLU networks in an expressive subspace of weights
Dayal Singh, G. J. Sreejith
23 Mar 2021

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Keyulu Xu, Mozhi Zhang, Jingling Li, S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
MLT
24 Sep 2020

Geometric compression of invariant manifolds in neural nets
J. Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart
MLT
22 Jul 2020

Phase diagram for two-layer ReLU neural networks at infinite-width limit
Yaoyu Zhang, Zhi-Qin John Xu, Zheng Ma
15 Jul 2020

Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time
Tolga Ergen, Mert Pilanci
26 Jun 2020

On Sparsity in Overparametrised Shallow ReLU Networks
Jaume de Dios, Joan Bruna
18 Jun 2020

Convex Geometry and Duality of Over-parameterized Neural Networks
Tolga Ergen, Mert Pilanci
MLT
25 Feb 2020

Revealing the Structure of Deep Neural Networks via Convex Duality
Tolga Ergen, Mert Pilanci
MLT
22 Feb 2020

Frivolous Units: Wider Networks Are Not Really That Wide
Stephen Casper, Xavier Boix, Vanessa D’Amario, Ling Guo, Martin Schrimpf, Kasper Vinken, Gabriel Kreiman
10 Dec 2019

How Implicit Regularization of ReLU Neural Networks Characterizes the Learned Function -- Part I: the 1-D Case of Two Layers with Random First Layer
Jakob Heiss, Josef Teichmann, Hanna Wutte
MLT
07 Nov 2019