Tutel: Adaptive Mixture-of-Experts at Scale

7 June 2022
Changho Hwang, Wei Cui, Yifan Xiong, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Salas, Jithin Jose, Prabhat Ram, Joe Chau, Peng Cheng, Fan Yang, Mao Yang, Y. Xiong
MoE

Papers citing "Tutel: Adaptive Mixture-of-Experts at Scale"

5 / 5 papers shown

Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields
Zhenxing Mi, Ping Yin, Xue Xiao, Dan Xu
MoE · 14 · 0 · 0 · 04 May 2025

Accelerating Mixture-of-Experts Training with Adaptive Expert Replication
Athinagoras Skiadopoulos, Mark Zhao, Swapnil Gandhi, Thomas Norrie, Shrijeet Mukherjee, Christos Kozyrakis
MoE · 68 · 50 · 0 · 28 Apr 2025

M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining
Junyang Lin, An Yang, Jinze Bai, Chang Zhou, Le Jiang, ..., Jie M. Zhang, Yong Li, Wei Lin, Jingren Zhou, Hongxia Yang
MoE · 71 · 36 · 0 · 08 Oct 2021

Scalable and Efficient MoE Training for Multitask Multilingual Models
Young Jin Kim, A. A. Awan, Alexandre Muzio, Andres Felipe Cruz Salinas, Liyang Lu, Amr Hendy, Samyam Rajbhandari, Yuxiong He, Hany Awadalla
MoE · 75 · 69 · 0 · 22 Sep 2021

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
215 · 3,054 · 0 · 23 Jan 2020