2110.12661
Cited By
v1
v2
v3 (latest)
ZerO Initialization: Initializing Neural Networks with only Zeros and Ones
25 October 2021
Jiawei Zhao
Florian Schäfer
Anima Anandkumar
Papers citing
"ZerO Initialization: Initializing Neural Networks with only Zeros and Ones"
11 / 11 papers shown
Title
Principled Approaches for Extending Neural Architectures to Function Spaces for Operator Learning
Julius Berner
Miguel Liu-Schiaffini
Jean Kossaifi
Valentin Duruisseaux
Boris Bonev
Kamyar Azizzadenesheli
A. Anandkumar
AI4CE
116
0
0
12 Jun 2025
MLorc: Momentum Low-rank Compression for Large Language Model Adaptation
Wei Shen
Zhang Yaxiang
Minhui Huang
Mengfan Xu
Jiawei Zhang
Cong Shen
AI4CE
44
0
0
02 Jun 2025
Protocol Models: Scaling Decentralized Training with Communication-Efficient Model Parallelism
Sameera Ramasinghe
Thalaiyasingam Ajanthan
Gil Avraham
Yan Zuo
Alexander Long
GNN
69
0
0
02 Jun 2025
SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training
Yehonathan Refael
Guy Smorodinsky
Tom Tirer
Ofir Lindenbaum
32
0
0
30 May 2025
ASGO: Adaptive Structured Gradient Optimization
Kang An
Yuxing Liu
Boyao Wang
Shiqian Ma
Tong Zhang
ODL
150
5
0
26 Mar 2025
A Good Start Matters: Enhancing Continual Learning with Data-Driven Weight Initialization
Md Yousuf Harun
Christopher Kanan
AI4CE
93
0
0
09 Mar 2025
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
Yehonathan Refael
Jonathan Svirsky
Boris Shustin
Wasim Huleihel
Ofir Lindenbaum
101
4
0
31 Dec 2024
Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis
Hyunwoo Lee
Hayoung Choi
Hyunju Kim
72
2
0
03 Oct 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
126
2
0
30 Apr 2024
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Jiawei Zhao
Zhenyu Zhang
Beidi Chen
Zhangyang Wang
A. Anandkumar
Yuandong Tian
106
229
0
06 Mar 2024
Nonparametric Learning of Two-Layer ReLU Residual Units
Zhunxuan Wang
Linyun He
Chunchuan Lyu
Shay B. Cohen
MLT
OffRL
199
1
0
17 Aug 2020