arXiv: 2502.21231
ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs
28 February 2025
Hao Ge, Junda Feng, Qi Huang, Fangcheng Fu, Xiaonan Nie, Lei Zuo, Haibin Lin, Bin Cui, Xin Liu
Papers citing "ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs":
OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training
Juntao Zhao, Qi Lu, Wei Jia, Borui Wan, Lei Zuo, ..., Y. Hu, Yanghua Peng, H. Lin, Xin Liu, Chuan Wu
14 Apr 2025