ResearchTrend.AI

TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning

11 April 2023
William Won
Midhilesh Elavazhagan
S. Srinivasan
A. Durg
Samvit Kaul
Swati Gupta
Tushar Krishna

Papers citing "TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning"

6 / 6 papers shown
Towards Easy and Realistic Network Infrastructure Testing for Large-scale Machine Learning
Jinsun Yoo
ChonLam Lao
Lianjie Cao
Bob Lantz
Minlan Yu
Tushar Krishna
Puneet Sharma
52
0
0
29 Apr 2025
Hiding Communication Cost in Distributed LLM Training via Micro-batch Co-execution
Haiquan Wang
Chaoyi Ruan
Jia He
Jiaqi Ruan
Chengjie Tang
Xiaosong Ma
Cheng-rong Li
73
1
0
24 Nov 2024
Towards a Standardized Representation for Deep Learning Collective Algorithms
Jinsun Yoo
William Won
Meghan Cowan
Nan Jiang
Benjamin Klenk
Srinivas Sridharan
Tushar Krishna
19
1
0
20 Aug 2024
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale
William Won
Taekyung Heo
Saeed Rashidi
Srinivas Sridharan
S. Srinivasan
T. Krishna
36
43
0
24 Mar 2023
LIBRA: Enabling Workload-aware Multi-dimensional Network Topology Optimization for Distributed Training of Large AI Models
William Won
Saeed Rashidi
S. Srinivasan
T. Krishna
AI4CE
12
7
0
24 Sep 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,740
0
26 Sep 2016