Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.00611
Cited By
WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem
30 June 2024
Ziming Liu
Shaoyu Wang
Shenggan Cheng
Zhongkai Zhao
Xuanlei Zhao
James Demmel
Yang You
Re-assign community
ArXiv
PDF
HTML
Papers citing
"WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem"
4 / 4 papers shown
Title
An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics
Huan Yee Koh
Jiaxin Ju
Ming Liu
Shirui Pan
73
122
0
03 Jul 2022
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Shigang Li
Torsten Hoefler
GNN
AI4CE
LRM
77
130
0
14 Jul 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
160
413
0
18 Jan 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,815
0
17 Sep 2019
1