ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training
arXiv: 2406.02613
3 June 2024
Adel Nabli, Louis Fournier, Pierre Erbacher, Louis Serrano, Eugene Belilovsky, Edouard Oyallon
FedML
Papers citing "ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training" (5 of 5 shown)
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
Weigao Sun, Zhen Qin, Weixuan Sun, Shidi Li, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong
OffRL · 29 Jan 2024
Delay-adaptive step-sizes for asynchronous learning
Xuyang Wu, Sindri Magnússon, Hamid Reza Feyzmahdavian, M. Johansson
17 Feb 2022
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyang Yang, Minjia Zhang, Dong Li, Yuxiong He
MoE · 18 Jan 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
AIMat · 31 Dec 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE · 17 Sep 2019