
The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks

25 June 2020 · arXiv:2006.14599
Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington

Papers citing "The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks"

18 of 18 citing papers shown
• How Do Large Language Models Acquire Factual Knowledge During Pretraining?
  Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo
  KELM · 17 Jun 2024
• Hypernetwork-based Meta-Learning for Low-Rank Physics-Informed Neural Networks
  Woojin Cho, Kookjin Lee, Donsub Rim, Noseong Park
  AI4CE, PINN · 14 Oct 2023
• Robust Sparse Mean Estimation via Incremental Learning
  Jianhao Ma, Ruidi Chen, Yinghui He, S. Fattahi, Wei Hu
  24 May 2023
• Understanding the Initial Condensation of Convolutional Neural Networks
  Zhangchen Zhou, Hanxu Zhou, Yuqing Li, Zhi-Qin John Xu
  MLT, AI4CE · 17 May 2023
• Beyond Distribution Shift: Spurious Features Through the Lens of Training Dynamics
  Nihal Murali, A. Puli, Ke Yu, Rajesh Ranganath, Kayhan Batmanghelich
  AAML · 18 Feb 2023
• Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
  Jikai Jin, Zhiyuan Li, Kaifeng Lyu, S. Du, Jason D. Lee
  MLT · 27 Jan 2023
• When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
  Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo
  21 Oct 2022
• On the Activation Function Dependence of the Spectral Bias of Neural Networks
  Q. Hong, Jonathan W. Siegel, Qinyan Tan, Jinchao Xu
  09 Aug 2022
• Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
  Z. Li, Zixuan Wang, Jian Li
  26 Jul 2022
• Neural Collapse: A Review on Modelling Principles and Generalization
  Vignesh Kothapalli
  08 Jun 2022
• Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
  Eshaan Nichani, Yunzhi Bai, Jason D. Lee
  08 Jun 2022
• Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
  Noam Razin, Asaf Maman, Nadav Cohen
  27 Jan 2022
• Overview frequency principle/spectral bias in deep learning
  Z. Xu, Yaoyu Zhang, Tao Luo
  FaML · 19 Jan 2022
• Deep Learning Through the Lens of Example Difficulty
  R. Baldock, Hartmut Maennel, Behnam Neyshabur
  17 Jun 2021
• FEAR: A Simple Lightweight Method to Rank Architectures
  Debadeepta Dey, Shital C. Shah, Sébastien Bubeck
  OOD · 07 Jun 2021
• RATT: Leveraging Unlabeled Data to Guarantee Generalization
  Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton
  01 May 2021
• The large learning rate phase of deep learning: the catapult mechanism
  Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari
  ODL · 04 Mar 2020
• Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
  Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington
  14 Jun 2018