ResearchTrend.AI

T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives

30 January 2024
Suchita Pati
Shaizeen Aga
Mahzabeen Islam
Nuwan Jayasena
Matthew D. Sinclair
arXiv: 2401.16677

Papers citing "T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives"

8 / 8 papers shown
MSCCL++: Rethinking GPU Communication Abstractions for Cutting-edge AI Applications
Aashaka Shah
Abhinav Jangda
B. Li
Caio Rocha
Changho Hwang
...
Peng Cheng
Qinghua Zhou
Roshan Dathathri
Saeed Maleki
Ziyue Yang
GNN
47
0
0
11 Apr 2025
Importance Sampling via Score-based Generative Models
Heasung Kim
Taekyun Lee
Hyeji Kim
Gustavo de Veciana
MedIm
DiffM
110
1
0
07 Feb 2025
Reducing the Cost of Dropout in Flash-Attention by Hiding RNG with GEMM
Haiyue Ma
Jian Liu
Ronny Krashinsky
16
0
0
10 Oct 2024
ISO: Overlap of Computation and Communication within Seqenence For LLM Inference
Bin Xiao
Lei Su
21
0
0
04 Sep 2024
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
Jiangfei Duan
Shuo Zhang
Zerui Wang
Lijuan Jiang
Wenwen Qu
...
Dahua Lin
Yonggang Wen
Xin Jin
Tianwei Zhang
Peng Sun
69
7
0
29 Jul 2024
Optimizing Distributed ML Communication with Fused Computation-Collective Operations
Kishore Punniyamurthy
Khaled Hamidouche
Bradford M. Beckmann
FedML
21
8
0
11 May 2023
Scalable and Efficient MoE Training for Multitask Multilingual Models
Young Jin Kim
A. A. Awan
Alexandre Muzio
Andres Felipe Cruz Salinas
Liyang Lu
Amr Hendy
Samyam Rajbhandari
Yuxiong He
Hany Awadalla
MoE
88
82
0
22 Sep 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,791
0
17 Sep 2019