
arXiv:2104.09075
An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks

19 April 2021
A. Kahira
Truong Thao Nguyen
L. Bautista-Gomez
Ryousei Takano
Rosa M. Badia
M. Wahib

Papers citing "An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks" (4 of 4 shown):

  1. Optimizing CNN Using HPC Tools
     Shahrin Rahman
     07 Mar 2024

  2. Monitoring Collective Communication Among GPUs (GNN)
     Muhammet Abdullah Soytürk, Palwisha Akhtar, Erhan Tezcan, D. Unat
     20 Oct 2021

  3. Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA (OODD)
     M. Wahib, Haoyu Zhang, Truong Thao Nguyen, Aleksandr Drozd, Jens Domke, Lingqi Zhang, Ryousei Takano, Satoshi Matsuoka
     26 Aug 2020

  4. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism (MoE)
     M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
     17 Sep 2019