Hanayo: Harnessing Wave-like Pipeline Parallelism for Enhanced Large Model Training Efficiency

International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023

30 August 2023

Ziming Liu

Shenggan Cheng

Hao Zhou

Yang You

Papers citing "Hanayo: Harnessing Wave-like Pipeline Parallelism for Enhanced Large Model Training Efficiency"

AdaPtis: Reducing Pipeline Bubbles with Adaptive Pipeline Parallelism on Heterogeneous Models

102

28 Sep 2025

Data-Centric Elastic Pipeline Parallelism for Efficient Long-Context LLM Training

157

25 Sep 2025

Kimi K2: Open Agentic Intelligence

...

182

28 Jul 2025

Rethinking Dynamic Networks and Heterogeneous Computing with Automatic Parallelization

Asia-Pacific Workshop on Networking (AN), 2025

178

03 Jun 2025

Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints

Computer Vision and Pattern Recognition (CVPR), 2025

283

15 Mar 2025

PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization

221

03 Mar 2025

FreeRide: Harvesting Bubbles in Pipeline Parallelism

336

11 Sep 2024

Efficient Training of Large Language Models on Distributed Infrastructures: A Survey

...

Dahua Lin

Yonggang Wen

Xin Jin

Tianwei Zhang

Yang Liu

369

29 Jul 2024

WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem

James Demmel

219

30 Jun 2024

GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism

Sunghyun Park

...

216

24 Jun 2024

Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey

Xiping Hu

339

12 Jun 2024

2BP: 2-Stage Backpropagation

123

28 May 2024

Pipeline Parallelism with Controllable Memory

243

24 May 2024

SlipStream: Adapting Pipelines for Distributed Training of Large DNNs Amid Failures

Symposium on Operating Systems Principles (SOSP), 2024

Swapnil Gandhi

Mark Zhao

Athinagoras Skiadopoulos

Christos Kozyrakis

210

22 May 2024

Checkpoint Merging via Bayesian Optimization in LLM Pretraining

300

28 Mar 2024

Training and Serving System of Foundation Models: A Comprehensive Survey

227

05 Jan 2024

PipeOptim: Ensuring Effective 1F1B Schedule with Optimizer-Dependent Weight Prediction

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023

311

01 Dec 2023