ResearchTrend.AI


Demystifying the Communication Characteristics for Distributed Transformer Models

19 August 2024
Quentin G. Anthony
Benjamin Michalowicz
Jacob Hatef
Lang Xu
Mustafa Abduljabbar
A. Shafi
Hari Subramoni
D. Panda
    AI4CE

Papers citing "Demystifying the Communication Characteristics for Distributed Transformer Models"

3 / 3 papers shown
Title
GPU-centric Communication Schemes for HPC and ML Applications
Naveen Namashivayam
GNN
25
0
0
31 Mar 2025
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,453
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,817
0
17 Sep 2019