Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models

IEEE Transactions on Parallel and Distributed Systems (TPDS), 2022
10 June 2022
Zhiquan Lai
Shengwei Li
Xudong Tang
Ke-shi Ge
Weijie Liu
Yabo Duan
Linbo Qiao
Dongsheng Li

Papers citing "Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models"

13 papers shown
AdaPtis: Reducing Pipeline Bubbles with Adaptive Pipeline Parallelism on Heterogeneous Models
Jihu Guo
Tenghui Ma
Wei Gao
Peng Sun
Jiaxing Li
Xun Chen
Yuyang Jin
Dahua Lin
28 Sep 2025
LeMix: Unified Scheduling for LLM Training and Inference on Multi-GPU Systems
Yufei Li
Zexin Li
Yinglun Zhu
Cong Liu
28 Jul 2025
Exploiting Block Coordinate Descent for Cost-Effective LLM Model Training
Zeyu Liu
Yunquan Zhang
Boyang Zhang
Guoyong Jiang
Xin Zhang
L. Xiao
Weifeng Zhang
Daning Cheng
23 May 2025
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
European Conference on Computer Systems (EuroSys), 2025
Zhanda Zhu
Christina Giannoula
Muralidhar Andoorveedu
Qidong Su
Karttikeya Mangalam
Bojian Zheng
Gennady Pekhimenko
24 Mar 2025
Real-time and Downtime-tolerant Fault Diagnosis for Railway Turnout Machines (RTMs) Empowered with Cloud-Edge Pipeline Parallelism
Fan Wu
Muhammad Bilal
Haolong Xiang
Heng Wang
Jinjun Yu
Xiaolong Xu
04 Nov 2024
Poplar: Efficient Scaling of Distributed DNN Training on Heterogeneous GPU Clusters
AAAI Conference on Artificial Intelligence (AAAI), 2024
WenZheng Zhang
Yang Hu
Jing Shi
Xiaoying Bai
22 Aug 2024
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
Jiangfei Duan
Shuo Zhang
Zerui Wang
Lijuan Jiang
Wenwen Qu
...
Dahua Lin
Yonggang Wen
Xin Jin
Tianwei Zhang
Yang Liu
29 Jul 2024
Optimizing Large Model Training through Overlapped Activation Recomputation
Ping Chen
Wenjie Zhang
Shuibing He
Yingjie Gu
Zhuwei Peng
...
Yi Zheng
Zhefeng Wang
Yanlong Yin
Gang Chen
13 Jun 2024
InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding
Qiaoling Chen
Diandian Gu
Guoteng Wang
Xun Chen
Yingtong Xiong
...
Qi Hu
Xin Jin
Yonggang Wen
Tianwei Zhang
Yang Liu
17 Jan 2024
Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates
Symposium on Operating Systems Principles (SOSP), 2023
Insu Jang
Zhenning Yang
Zhen Zhang
Xin Jin
Mosharaf Chowdhury
15 Sep 2023
Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
Shengwei Li
Zhiquan Lai
Yanqi Hao
Weijie Liu
Ke-shi Ge
Xiaoge Deng
Dongsheng Li
KaiCheng Lu
25 May 2023
A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs
Siddharth Singh
Prajwal Singhania
Aditya K. Ranjan
Zack Sating
A. Bhatele
22 May 2023
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models
Yuliang Liu
Shenggui Li
Jiarui Fang
Yan Shao
Boyuan Yao
Yang You
06 Feb 2023