2306.08423
DistSim: A performance model of large-scale hybrid distributed DNN training
14 June 2023
Guandong Lu, Run Chen, Yakai Wang, Yangjie Zhou, Rui Zhang, Zheng Hu, Yanming Miao, Zhifang Cai, Li-Wei Li, Jingwen Leng, Minyi Guo
Papers citing "DistSim: A performance model of large-scale hybrid distributed DNN training"
5 / 5 papers shown
Phantora: Live GPU Cluster Simulation for Machine Learning System Performance Estimation
Jianxing Qin, Jingrong Chen, Xinhao Kong, Yongji Wu, Liang Luo, Z. Wang, Ying Zhang, Tingjun Chen, Alvin R. Lebeck, Danyang Zhuo
45
0
0
02 May 2025
Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training
Cong Guo, Yuxian Qiu, Jingwen Leng, Chen Zhang, Yingdian Cao, Quan Zhang, Yunxin Liu, Fan Yang, Minyi Guo
AI4CE
62
4
0
22 Sep 2022
SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation
Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo
MQ
58
67
0
14 Feb 2022
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Shigang Li, Torsten Hoefler
GNN
AI4CE
LRM
77
130
0
14 Jul 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
243
1,791
0
17 Sep 2019