arXiv: 2311.05610
Efficient Parallelization Layouts for Large-Scale Distributed Model Training
9 November 2023
Johannes Hagemann, Samuel Weinbach, Konstantin Dobler, Maximilian Schall, Gerard de Melo
Papers citing "Efficient Parallelization Layouts for Large-Scale Distributed Model Training" (3 of 3 papers shown):
1. Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
   Jared Fernandez, Luca Wehrstedt, Leonid Shamis, Mostafa Elhoushi, Kalyan Saladi, Yonatan Bisk, Emma Strubell, Jacob Kahn
   20 Nov 2024

2. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
   Ofir Press, Noah A. Smith, M. Lewis
   27 Aug 2021

3. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
   M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
   17 Sep 2019