2407.14645
Cited By
Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference
19 July 2024
Joyjit Kundu, Wenzhe Guo, Ali BanaGozar, Udari De Alwis, Sourav Sengupta, Puneet Gupta, Arindam Mallik
Papers citing
"Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference"
3 / 3 papers shown
Title
MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models
Zhen Zhang, Y. Yang, Kai Zhen, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann, Zheng Zhang
54
0
0
17 Feb 2025
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale
William Won, Taekyung Heo, Saeed Rashidi, Srinivas Sridharan, S. Srinivasan, T. Krishna
36
39
0
24 Mar 2023
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
243
1,791
0
17 Sep 2019