2603.07685
Cited By
v2 (latest)
Scalable Training of Mixture-of-Experts Models with Megatron Core
8 March 2026
Zijie Yan
Hongxiao Bai
Xin Yao
Dennis Liu
Tong Liu
Hongbin Liu
Pingtian Li
Evan Wu
Shiqing Fan
Li Tao
Robin Zhang
Yuzhong Wang
Shifang Xu
Jack Chang
Xuwen Chen
Kunlun Li
Yan Bai
Gao Deng
Nan Zheng
Vijay Anand Korthikanti
Abhinav Khattar
Ethan He
Soham Govande
Sangkug Lym
Zhongbo Zhu
Qi Zhang
Haochen Yuan
Xiaowei Ren
Deyu Fu
Tailai Ma
Shunkang Zhang
Jiang Shao
Ray Wang
Vasudevan Rengasamy
Rachit Garg
Santosh Bhavani
Xipeng Li
Chandler Zhou
David Wu
Yingcan Wei
Ashwath Aithal
Michael Andersch
Mohammad Shoeybi
Jiajie Yao
June Yang
MoE
Github (15591★)
Papers citing "Scalable Training of Mixture-of-Experts Models with Megatron Core"
No papers found