ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2002.10365
  4. Cited By
The Early Phase of Neural Network Training

The Early Phase of Neural Network Training

24 February 2020
Jonathan Frankle
D. Schwab
Ari S. Morcos
ArXivPDFHTML

Papers citing "The Early Phase of Neural Network Training"

34 / 34 papers shown
Title
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation
ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation
Yanzhou Pan
Huawei Lin
Yide Ran
Jiamin Chen
Xiaodong Yu
Weijie Zhao
Denghui Zhang
Zhaozhuo Xu
40
0
0
02 Mar 2025
The Silent Majority: Demystifying Memorization Effect in the Presence of Spurious Correlations
The Silent Majority: Demystifying Memorization Effect in the Presence of Spurious Correlations
Chenyu You
Haocheng Dai
Yifei Min
Jasjeet Sekhon
S. Joshi
James S. Duncan
60
2
0
01 Jan 2025
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
Zhanpeng Zhou
Mingze Wang
Yuchen Mao
Bingrui Li
Junchi Yan
AAML
62
0
0
14 Oct 2024
Can Optimization Trajectories Explain Multi-Task Transfer?
Can Optimization Trajectories Explain Multi-Task Transfer?
David Mueller
Mark Dredze
Nicholas Andrews
55
1
0
26 Aug 2024
Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Q-Newton: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
Pingzhi Li
Junyu Liu
Hanrui Wang
Tianlong Chen
84
1
0
30 Apr 2024
Towards On-device Learning on the Edge: Ways to Select Neurons to Update
  under a Budget Constraint
Towards On-device Learning on the Edge: Ways to Select Neurons to Update under a Budget Constraint
Ael Quélennec
Enzo Tartaglione
Pavlo Mozharovskyi
Van-Tam Nguyen
26
2
0
08 Dec 2023
Reset It and Forget It: Relearning Last-Layer Weights Improves Continual
  and Transfer Learning
Reset It and Forget It: Relearning Last-Layer Weights Improves Continual and Transfer Learning
Lapo Frati
Neil Traft
Jeff Clune
Nick Cheney
CLL
19
0
0
12 Oct 2023
Accelerating Distributed ML Training via Selective Synchronization
Accelerating Distributed ML Training via Selective Synchronization
S. Tyagi
Martin Swany
FedML
24
3
0
16 Jul 2023
GraVAC: Adaptive Compression for Communication-Efficient Distributed DL
  Training
GraVAC: Adaptive Compression for Communication-Efficient Distributed DL Training
S. Tyagi
Martin Swany
25
4
0
20 May 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu
Tianlong Chen
Zhenyu (Allen) Zhang
Xuxi Chen
Tianjin Huang
Ajay Jaiswal
Zhangyang Wang
29
29
0
03 Mar 2023
Accelerating Dataset Distillation via Model Augmentation
Accelerating Dataset Distillation via Model Augmentation
Lei Zhang
Jie M. Zhang
Bowen Lei
Subhabrata Mukherjee
Xiang Pan
Bo-Lu Zhao
Caiwen Ding
Y. Li
Dongkuan Xu
DD
37
62
0
12 Dec 2022
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning
  Ticket's Mask?
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
Mansheej Paul
F. Chen
Brett W. Larsen
Jonathan Frankle
Surya Ganguli
Gintare Karolina Dziugaite
UQCV
25
38
0
06 Oct 2022
Stop Wasting My Time! Saving Days of ImageNet and BERT Training with
  Latest Weight Averaging
Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging
Jean Kaddour
MoMe
3DH
19
39
0
29 Sep 2022
Where to Begin? On the Impact of Pre-Training and Initialization in
  Federated Learning
Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning
John Nguyen
Jianyu Wang
Kshitiz Malik
Maziar Sanjabi
Michael G. Rabbat
FedML
AI4CE
23
21
0
30 Jun 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of
  Large Language Models
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
23
185
0
22 May 2022
The rise of the lottery heroes: why zero-shot pruning is hard
The rise of the lottery heroes: why zero-shot pruning is hard
Enzo Tartaglione
21
6
0
24 Feb 2022
Cyclical Focal Loss
Cyclical Focal Loss
L. Smith
30
14
0
16 Feb 2022
Eigenvalues of Autoencoders in Training and at Initialization
Eigenvalues of Autoencoders in Training and at Initialization
Ben Dees
S. Agarwala
Corey Lowman
19
0
0
27 Jan 2022
Auto-Compressing Subset Pruning for Semantic Image Segmentation
Auto-Compressing Subset Pruning for Semantic Image Segmentation
Konstantin Ditschuneit
Johannes Otterbach
13
5
0
26 Jan 2022
Neural Capacitance: A New Perspective of Neural Network Selection via
  Edge Dynamics
Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics
Chunheng Jiang
Tejaswini Pedapati
Pin-Yu Chen
Yizhou Sun
Jianxi Gao
21
2
0
11 Jan 2022
When to Prune? A Policy towards Early Structural Pruning
When to Prune? A Policy towards Early Structural Pruning
Maying Shen
Pavlo Molchanov
Hongxu Yin
J. Álvarez
VLM
20
52
0
22 Oct 2021
Rethinking Graph Auto-Encoder Models for Attributed Graph Clustering
Rethinking Graph Auto-Encoder Models for Attributed Graph Clustering
Nairouz Mrabah
Mohamed Bouguessa
M. Touati
Riadh Ksantini
28
62
0
19 Jul 2021
Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win
  the Jackpot?
Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?
Xiaolong Ma
Geng Yuan
Xuan Shen
Tianlong Chen
Xuxi Chen
...
Ning Liu
Minghai Qin
Sijia Liu
Zhangyang Wang
Yanzhi Wang
19
63
0
01 Jul 2021
GANs Can Play Lottery Tickets Too
GANs Can Play Lottery Tickets Too
Xuxi Chen
Zhenyu (Allen) Zhang
Yongduo Sui
Tianlong Chen
GAN
19
58
0
31 May 2021
RATT: Leveraging Unlabeled Data to Guarantee Generalization
RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg
Sivaraman Balakrishnan
J. Zico Kolter
Zachary Chase Lipton
28
29
0
01 May 2021
Sifting out the features by pruning: Are convolutional networks the
  winning lottery ticket of fully connected ones?
Sifting out the features by pruning: Are convolutional networks the winning lottery ticket of fully connected ones?
Franco Pellegrini
Giulio Biroli
46
6
0
27 Apr 2021
Playing Lottery Tickets with Vision and Language
Playing Lottery Tickets with Vision and Language
Zhe Gan
Yen-Chun Chen
Linjie Li
Tianlong Chen
Yu Cheng
Shuohang Wang
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
103
53
0
23 Apr 2021
Recent Advances on Neural Network Pruning at Initialization
Recent Advances on Neural Network Pruning at Initialization
Huan Wang
Can Qin
Yue Bai
Yulun Zhang
Yun Fu
CVBM
31
64
0
11 Mar 2021
Consensus Control for Decentralized Deep Learning
Consensus Control for Decentralized Deep Learning
Lingjing Kong
Tao R. Lin
Anastasia Koloskova
Martin Jaggi
Sebastian U. Stich
19
75
0
09 Feb 2021
Constraints on Hebbian and STDP learned weights of a spiking neuron
Constraints on Hebbian and STDP learned weights of a spiking neuron
Dominique F. Chu
Huy Le Nguyen
15
8
0
14 Dec 2020
What Do Neural Networks Learn When Trained With Random Labels?
What Do Neural Networks Learn When Trained With Random Labels?
Hartmut Maennel
Ibrahim M. Alabdulmohsin
Ilya O. Tolstikhin
R. Baldock
Olivier Bousquet
Sylvain Gelly
Daniel Keysers
FedML
40
86
0
18 Jun 2020
The large learning rate phase of deep learning: the catapult mechanism
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
159
234
0
04 Mar 2020
Selectivity considered harmful: evaluating the causal impact of class
  selectivity in DNNs
Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs
Matthew L. Leavitt
Ari S. Morcos
55
33
0
03 Mar 2020
DARTS+: Improved Differentiable Architecture Search with Early Stopping
DARTS+: Improved Differentiable Architecture Search with Early Stopping
Hanwen Liang
Shifeng Zhang
Jiacheng Sun
Xingqiu He
Weiran Huang
Kechen Zhuang
Zhenguo Li
14
287
0
13 Sep 2019
1