
Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion

26 September 2022
Xiaodong Yi
Shiwei Zhang
Lansong Diao
Chuan Wu
Zhen Zheng
Shiqing Fan
Siyu Wang
Jun Yang
W. Lin

Papers citing "Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion"

3 / 3 papers shown
TiMePReSt: Time and Memory Efficient Pipeline Parallel DNN Training with Removed Staleness
Ankita Dutta
Nabendu Chaki
Rajat K. De
18 Oct 2024
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
17 Sep 2019
Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks
Minjie Wang
Da Zheng
Zihao Ye
Quan Gan
Mufei Li
...
J. Zhao
Haotong Zhang
Alex Smola
Jinyang Li
Zheng-Wei Zhang
AI4CE
GNN
03 Sep 2019