2201.11840
GC3: An Optimizing Compiler for GPU Collective Communication
27 January 2022
M. Cowan
Saeed Maleki
Madan Musuvathi
Olli Saarikivi
Yifan Xiong
GNN
Papers citing
"GC3: An Optimizing Compiler for GPU Collective Communication"
6 / 6 papers shown
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
Jiangfei Duan
Shuo Zhang
Zerui Wang
Lijuan Jiang
Wenwen Qu
...
Dahua Lin
Yonggang Wen
Xin Jin
Tianwei Zhang
Peng Sun
141
12
0
29 Jul 2024
TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning
William Won
Suvinay Subramanian
Sudarshan Srinivasan
A. Durg
Samvit Kaul
Swati Gupta
Tushar Krishna
86
7
0
11 Apr 2023
On Optimizing the Communication of Model Parallelism
Yonghao Zhuang
Hexu Zhao
Lianmin Zheng
Zhuohan Li
Eric P. Xing
Qirong Ho
Joseph E. Gonzalez
Ion Stoica
Haotong Zhang
111
26
0
10 Nov 2022
Impact of RoCE Congestion Control Policies on Distributed Training of DNNs
Tarannum Khan
Saeed Rashidi
Srinivas Sridharan
Pallavi Shurpali
Aditya Akella
T. Krishna
OOD
75
11
0
22 Jul 2022
Efficient Direct-Connect Topologies for Collective Communications
Liangyu Zhao
Siddharth Pal
Tapan Chugh
Weiyang Wang
Jason Fantl
P. Basu
J. Khoury
Arvind Krishnamurthy
86
7
0
07 Feb 2022
Themis: A Network Bandwidth-Aware Collective Scheduling Policy for Distributed Training of DL Models
Saeed Rashidi
William Won
Sudarshan Srinivasan
Srinivas Sridharan
T. Krishna
GNN
88
34
0
09 Oct 2021