2102.08098
GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training
16 February 2021
Chen Zhu
Renkun Ni
Zheng Xu
Kezhi Kong
W. R. Huang
Tom Goldstein
ODL
Papers citing "GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training"
12 / 12 papers shown
Title
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
13
41
0
12 Jul 2023
Convex Dual Theory Analysis of Two-Layer Convolutional Neural Networks with Soft-Thresholding
Chunyan Xiong
Meng Lu
Xiaotong Yu
Jian-Peng Cao
Zhong Chen
D. Guo
X. Qu
MLT
33
0
0
14 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
24
39
0
07 Apr 2023
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
Boris Knyazev
Doha Hwang
Simon Lacoste-Julien
AI4CE
24
17
0
07 Mar 2023
CyclicFL: A Cyclic Model Pre-Training Approach to Efficient Federated Learning
Peng Zhang
Yingbo Zhou
Ming Hu
Xin Fu
Xian Wei
Mingsong Chen
FedML
24
1
0
28 Jan 2023
NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction
Yun Yi
Haokui Zhang
Wenze Hu
Nannan Wang
Xiaoyu Wang
AI4TS
AI4CE
19
8
0
15 Nov 2022
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
23
156
0
24 Oct 2022
Dynamical Isometry for Residual Networks
Advait Gadhikar
R. Burkholz
ODL
AI4CE
29
2
0
05 Oct 2022
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
28
74
0
18 Oct 2021
AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks
G. Bingham
Risto Miikkulainen
ODL
18
4
0
18 Sep 2021
Data-driven Weight Initialization with Sylvester Solvers
Debasmit Das
Yash Bhalgat
Fatih Porikli
ODL
17
3
0
02 May 2021
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
223
512
0
11 Feb 2021