Gradient Descent Happens in a Tiny Subspace
  Guy Gur-Ari, Daniel A. Roberts, Ethan Dyer · 12 December 2018 · arXiv:1812.04754

Papers citing "Gradient Descent Happens in a Tiny Subspace"

50 / 163 papers shown
Taxonomizing local versus global structure in neural network loss landscapes
  Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, K. Ramchandran, Michael W. Mahoney · 23 Jul 2021
Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
  Yossi Arjevani, M. Field · 21 Jul 2021
The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion
  D. Kunin, Javier Sagastuy-Breña, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins · 19 Jul 2021
How many degrees of freedom do we need to train deep networks: a loss landscape perspective
  Brett W. Larsen, Stanislav Fort, Nico Becker, Surya Ganguli · 13 Jul 2021
Structured Directional Pruning via Perturbation Orthogonal Projection
  Yinchuan Li, Xiaofeng Liu, Yunfeng Shao, Qing Wang, Yanhui Geng · 12 Jul 2021
Analytic Insights into Structure and Rank of Neural Network Hessian Maps
  Sidak Pal Singh, Gregor Bachmann, Thomas Hofmann · 30 Jun 2021
Large Scale Private Learning via Low-rank Reparametrization
  Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, Tie-Yan Liu · 17 Jun 2021
ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure
  Felix Dangel, Lukas Tatzel, Philipp Hennig · 04 Jun 2021
A study on the plasticity of neural networks
  Tudor Berariu, Wojciech M. Czarnecki, Soham De, J. Bornschein, Samuel L. Smith, Razvan Pascanu, Claudia Clopath · 31 May 2021
Privately Learning Subspaces
  Vikrant Singhal, Thomas Steinke · 28 May 2021
Power-law escape rate of SGD
  Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda · 20 May 2021
RATT: Leveraging Unlabeled Data to Guarantee Generalization
  Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton · 01 May 2021
Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes
  James Lucas, Juhan Bae, Michael Ruogu Zhang, Stanislav Fort, R. Zemel, Roger C. Grosse · 22 Apr 2021
Low Dimensional Landscape Hypothesis is True: DNNs can be Trained in Tiny Subspaces
  Tao Li, Lei Tan, Qinghua Tao, Yipeng Liu, Xiaolin Huang · 20 Mar 2021
Intraclass clustering: an implicit learning ability that regularizes DNNs
  Simon Carbonnelle, Christophe De Vleeschouwer · 11 Mar 2021
Hessian Eigenspectra of More Realistic Nonlinear Models
  Zhenyu Liao, Michael W. Mahoney · 02 Mar 2021
Experiments with Rich Regime Training for Deep Learning
  Xinyan Li, A. Banerjee · 26 Feb 2021
Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning
  Da Yu, Huishuai Zhang, Wei Chen, Tie-Yan Liu · 25 Feb 2021
Provable Super-Convergence with a Large Cyclical Learning Rate
  Samet Oymak · 22 Feb 2021
Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks
  Frank Schneider, Felix Dangel, Philipp Hennig · 12 Feb 2021
Tilting the playing field: Dynamical loss functions for machine learning
  M. Ruiz-García, Ge Zhang, S. Schoenholz, Andrea J. Liu · 07 Feb 2021
A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
  Adepu Ravi Sankar, Yash Khasbage, Rahul Vigneswaran, V. Balasubramanian · 07 Dec 2020
Quasi-Newton's method in the class gradient defined high-curvature subspace
  Mark Tuddenham, Adam Prugel-Bennett, Jonathan Hare · 28 Nov 2020
Align, then memorise: the dynamics of learning with feedback alignment
  Maria Refinetti, Stéphane d'Ascoli, Ruben Ohana, Sebastian Goldt · 24 Nov 2020
Improving Neural Network Training in Low Dimensional Random Bases
  Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi · 09 Nov 2020
Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification
  Saurabh Agarwal, Hongyi Wang, Kangwook Lee, Shivaram Venkataraman, Dimitris Papailiopoulos · 29 Oct 2020
Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks
  Yikai Wu, Xingyu Zhu, Chenwei Wu, Annie Wang, Rong Ge · 08 Oct 2020
Pretrained Language Model Embryology: The Birth of ALBERT
  Cheng-Han Chiang, Sung-Feng Huang, Hung-yi Lee · 06 Oct 2020
Why Adversarial Interaction Creates Non-Homogeneous Patterns: A Pseudo-Reaction-Diffusion Model for Turing Instability
  Litu Rout · 01 Oct 2020
Pruning Neural Networks at Initialization: Why are We Missing the Mark?
  Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin · 18 Sep 2020
A Framework for Private Matrix Analysis
  Jalaj Upadhyay, Sarvagya Upadhyay · 06 Sep 2020
Traces of Class/Cross-Class Structure Pervade Deep Learning Spectra
  V. Papyan · 27 Aug 2020
Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
  Yingxue Zhou, Zhiwei Steven Wu, A. Banerjee · 07 Jul 2020
AdaSGD: Bridging the gap between SGD and Adam
  Jiaxuan Wang, Jenna Wiens · 30 Jun 2020
The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks
  Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington · 25 Jun 2020
Directional Pruning of Deep Neural Networks
  Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng · 16 Jun 2020
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
  Diego Granziol, S. Zohren, Stephen J. Roberts · 16 Jun 2020
On the training dynamics of deep networks with $L_2$ regularization
  Aitor Lewkowycz, Guy Gur-Ari · 15 Jun 2020
Optimizing Neural Networks via Koopman Operator Theory
  Akshunna S. Dogra, William T. Redman · 03 Jun 2020
Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs
  Matthew L. Leavitt, Ari S. Morcos · 03 Mar 2020
The Early Phase of Neural Network Training
  Jonathan Frankle, D. Schwab, Ari S. Morcos · 24 Feb 2020
Scaling Laws for Neural Language Models
  Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei · 23 Jan 2020
How neural networks find generalizable solutions: Self-tuned annealing in deep learning
  Yu Feng, Y. Tu · 06 Jan 2020
On the Bias-Variance Tradeoff: Textbooks Need an Update
  Brady Neal · 17 Dec 2019
Linear Mode Connectivity and the Lottery Ticket Hypothesis
  Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin · 11 Dec 2019
Neural Spectrum Alignment: Empirical Study
  Dmitry Kopitkov, Vadim Indelman · 19 Oct 2019
Emergent properties of the local geometry of neural loss landscapes
  Stanislav Fort, Surya Ganguli · 14 Oct 2019
The asymptotic spectrum of the Hessian of DNN throughout training
  Arthur Jacot, Franck Gabriel, Clément Hongler · 01 Oct 2019
How noise affects the Hessian spectrum in overparameterized neural networks
  Ming-Bo Wei, D. Schwab · 01 Oct 2019
Asymptotics of Wide Networks from Feynman Diagrams
  Ethan Dyer, Guy Gur-Ari · 25 Sep 2019