ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2110.03742
  4. Cited By
Beyond Distillation: Task-level Mixture-of-Experts for Efficient
  Inference

Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference

24 September 2021
Sneha Kudugunta
Yanping Huang
Ankur Bapna
M. Krikun
Dmitry Lepikhin
Minh-Thang Luong
Orhan Firat
    MoE
ArXivPDFHTML

Papers citing "Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference"

7 / 7 papers shown
Title
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference
Suraiya Tairin
Shohaib Mahmud
Haiying Shen
Anand Iyer
MoE
41
0
0
10 Mar 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
56
1
0
10 Mar 2025
CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory
CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory
Jiashun Suo
Xiaojian Liao
Limin Xiao
Li Ruan
Jinquan Wang
Xiao Su
Zhisheng Huo
52
0
0
04 Mar 2025
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment
Zhili Liu
Yunhao Gou
Kai Chen
Lanqing Hong
Jiahui Gao
...
Yu Zhang
Zhenguo Li
Xin Jiang
Q. Liu
James T. Kwok
MoE
93
9
0
20 Feb 2025
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Di Zhang
Xi Li
MoE
35
2
0
28 Jun 2024
Efficiently Scaling Transformer Inference
Efficiently Scaling Transformer Inference
Reiner Pope
Sholto Douglas
Aakanksha Chowdhery
Jacob Devlin
James Bradbury
Anselm Levskaya
Jonathan Heek
Kefan Xiao
Shivani Agrawal
J. Dean
13
290
0
09 Nov 2022
Investigating Multilingual NMT Representations at Scale
Investigating Multilingual NMT Representations at Scale
Sneha Kudugunta
Ankur Bapna
Isaac Caswell
N. Arivazhagan
Orhan Firat
LRM
128
116
0
05 Sep 2019
1