FastMoE: A Fast Mixture-of-Expert Training System (arXiv:2103.13262)
24 March 2021
Jiaao He, J. Qiu, Aohan Zeng, Zhilin Yang, Jidong Zhai, Jie Tang
[ALM, MoE]
Papers citing "FastMoE: A Fast Mixture-of-Expert Training System" (50 of 63 papers shown)
Faster MoE LLM Inference for Extremely Large Models. Haoqi Yang, Luohe Shi, Qiwei Li, Zuchao Li, Ping Wang, Bo Du, Mengjia Shen, Hai Zhao. 06 May 2025. [MoE]
Taming the Titans: A Survey of Efficient LLM Inference Serving. Ranran Zhen, J. Li, Yixin Ji, Z. Yang, Tong Liu, Qingrong Xia, Xinyu Duan, Z. Wang, Baoxing Huai, M. Zhang. 28 Apr 2025. [LLMAG]
Accelerating MoE Model Inference with Expert Sharding. Oana Balmau, Anne-Marie Kermarrec, Rafael Pires, André Loureiro Espírito Santo, M. Vos, Milos Vujasinovic. 11 Mar 2025. [MoE]
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference. Suraiya Tairin, Shohaib Mahmud, Haiying Shen, Anand Iyer. 10 Mar 2025. [MoE]
Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling. Yan Li, Pengfei Zheng, Shuang Chen, Zewei Xu, Yuanhao Lai, Yunfei Du, Z. Wang. 06 Mar 2025. [MoE]
Sample Selection via Contrastive Fragmentation for Noisy Label Regression. C. Kim, Sangwoo Moon, Jihwan Moon, Dongyeon Woo, Gunhee Kim. 25 Feb 2025. [NoLa]
MoETuner: Optimized Mixture of Expert Serving with Balanced Expert Placement and Token Routing. Seokjin Go, Divya Mahajan. 10 Feb 2025. [MoE]
Importance Sampling via Score-based Generative Models. Heasung Kim, Taekyun Lee, Hyeji Kim, Gustavo de Veciana. 07 Feb 2025. [MedIm, DiffM]
Hiding Communication Cost in Distributed LLM Training via Micro-batch Co-execution. Haiquan Wang, Chaoyi Ruan, Jia He, Jiaqi Ruan, Chengjie Tang, Xiaosong Ma, Cheng-rong Li. 24 Nov 2024.
Communication-Efficient Sparsely-Activated Model Training via Sequence Migration and Token Condensation. Fahao Chen, Peng Li, Zicong Hong, Zhou Su, Song Guo. 23 Nov 2024. [MoMe, MoE]
HEXA-MoE: Efficient and Heterogeneous-aware MoE Acceleration with ZERO Computation Redundancy. Shuqing Luo, Jie Peng, Pingzhi Li, Tianlong Chen. 02 Nov 2024. [MoE]
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models. Nam V. Nguyen, Thong T. Doan, Luong Tran, Van Nguyen, Quang Pham. 01 Nov 2024. [MoE]
Efficient and Interpretable Grammatical Error Correction with Mixture of Experts. Muhammad Reza Qorib, Alham Fikri Aji, Hwee Tou Ng. 30 Oct 2024. [KELM, MoE]
MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning. Xujia Wang, Haiyan Zhao, Shuo Wang, Hanqing Wang, Zhiyuan Liu. 30 Oct 2024. [MoMe, MoE]
EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference. Yulei Qian, Fengcun Li, Xiangyang Ji, Xiaoyu Zhao, Jianchao Tan, K. Zhang, Xunliang Cai. 16 Oct 2024. [MoE]
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey. Jiangfei Duan, Shuo Zhang, Zerui Wang, Lijuan Jiang, Wenwen Qu, ..., Dahua Lin, Yonggang Wen, Xin Jin, Tianwei Zhang, Peng Sun. 29 Jul 2024.
Powering In-Database Dynamic Model Slicing for Structured Data Analytics. Lingze Zeng, Naili Xing, Shaofeng Cai, Gang Chen, Bengchin Ooi, Jian Pei, Yuncheng Wu. 01 May 2024.
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models. Songtao Jiang, Tuo Zheng, Yan Zhang, Yeying Jin, Li Yuan, Zuozhu Liu. 16 Apr 2024. [MoE]
Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts. Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang. 07 Apr 2024.
Vanilla Transformers are Transfer Capability Teachers. Xin Lu, Yanyan Zhao, Bing Qin. 04 Mar 2024. [MoE]
Rethinking RGB Color Representation for Image Restoration Models. Jaerin Lee, J. Park, Sungyong Baik, Kyoung Mu Lee. 05 Feb 2024.
LocMoE: A Low-Overhead MoE for Large Language Model Training. Jing Li, Zhijie Sun, Xuan He, Li Zeng, Yi Lin, Entong Li, Binfan Zheng, Rongqian Zhao, Xin Chen. 25 Jan 2024. [MoE]
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference. Jinghan Yao, Quentin G. Anthony, A. Shafi, Hari Subramoni, Dhabaleswar K. Panda. 16 Jan 2024. [MoE]
HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis. Shiwei Zhang, Lansong Diao, Chuan Wu, Zongyan Cao, Siyu Wang, Wei Lin. 11 Jan 2024.
Training and Serving System of Foundation Models: A Comprehensive Survey. Jiahang Zhou, Yanyu Chen, Zicong Hong, Wuhui Chen, Yue Yu, Tao Zhang, Hui Wang, Chuan-fu Zhang, Zibin Zheng. 05 Jan 2024. [ALM]
Understanding LLMs: A Comprehensive Overview from Training to Inference. Yi-Hsueh Liu, Haoyang He, Tianle Han, Xu-Yao Zhang, Mengyuan Liu, ..., Xintao Hu, Tuo Zhang, Ning Qiang, Tianming Liu, Bao Ge. 04 Jan 2024. [SyDa]
OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers. Hanming Liang, Jiacheng Bao, Ruichi Zhang, Sihan Ren, Yuecheng Xu, Sibei Yang, Xin Chen, Jingyi Yu, Lan Xu. 14 Dec 2023.
Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models. Keming Lu, Hongyi Yuan, Runji Lin, Junyang Lin, Zheng Yuan, Chang Zhou, Jingren Zhou. 15 Nov 2023. [MoE, LRM]
Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer. Boan Liu, Liang Ding, Li Shen, Keqin Peng, Yu Cao, Dazhao Cheng, Dacheng Tao. 15 Oct 2023. [MoE]
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy. Pingzhi Li, Zhenyu (Allen) Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen. 02 Oct 2023. [MoMe]
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference. Ranggi Hwang, Jianyu Wei, Shijie Cao, Changho Hwang, Xiaohu Tang, Ting Cao, Mao Yang. 23 Aug 2023. [MoE]
Experts Weights Averaging: A New General Training Scheme for Vision Transformers. Yongqian Huang, Peng Ye, Xiaoshui Huang, Sheng R. Li, Tao Chen, Tong He, Wanli Ouyang. 11 Aug 2023. [MoMe]
MediaGPT: A Large Language Model For Chinese Media. Zhonghao Wang, Zijia Lu, Boshen Jin, Haiying Deng. 20 Jul 2023. [LM&MA]
A Comprehensive Overview of Large Language Models. Humza Naveed, Asad Ullah Khan, Shi Qiu, Muhammad Saqib, Saeed Anwar, Muhammad Usman, Naveed Akhtar, Nick Barnes, Ajmal Saeed Mian. 12 Jul 2023. [OffRL]
ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer. Haoran You, Huihong Shi, Yipin Guo, Yingyan Lin. 10 Jun 2023.
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts. Rishov Sarkar, Hanxue Liang, Zhiwen Fan, Zhangyang Wang, Cong Hao. 30 May 2023. [MoE]
RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths. Zeyue Xue, Guanglu Song, Qiushan Guo, Boxiao Liu, Zhuofan Zong, Yu Liu, Ping Luo. 29 May 2023. [DiffM]
Lifting the Curse of Capacity Gap in Distilling Language Models. Chen Zhang, Yang Yang, Jiahao Liu, Jingang Wang, Yunsen Xian, Benyou Wang, Dawei Song. 20 May 2023. [MoE]
UKP-SQuARE v3: A Platform for Multi-Agent QA Research. Haritz Puerto, Tim Baumgärtner, Rachneet Sachdeva, Haishuo Fang, Haotian Zhang, Sewin Tariverdian, Kexin Wang, Iryna Gurevych. 31 Mar 2023.
Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference. Haiyang Huang, Newsha Ardalani, Anna Y. Sun, Liu Ke, Hsien-Hsin S. Lee, Anjali Sridhar, Shruti Bhosale, Carole-Jean Wu, Benjamin C. Lee. 10 Mar 2023. [MoE]
TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training. Chang-Qin Chen, Min Li, Zhihua Wu, Dianhai Yu, Chao Yang. 20 Feb 2023. [MoE]
Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition. Ye Bai, Jie Li, W. Han, Hao Ni, Kaituo Xu, Zhuo Zhang, Cheng Yi, Xiaorui Wang. 17 Sep 2022. [MoE]
Efficient Sparsely Activated Transformers. Salar Latifi, Saurav Muralidharan, M. Garland. 31 Aug 2022. [MoE]
Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries. Xiao Liu, Shiyu Zhao, Kai Su, Yukuo Cen, J. Qiu, Mengdi Zhang, Wei Yu Wu, Yuxiao Dong, Jie Tang. 16 Aug 2022.
Neural Implicit Dictionary via Mixture-of-Expert Training. Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang. 08 Jul 2022.
Optimizing Mixture of Experts using Dynamic Recompilations. Ferdinand Kossmann, Zhihao Jia, A. Aiken. 04 May 2022.
Residual Mixture of Experts. Lemeng Wu, Mengchen Liu, Yinpeng Chen, Dongdong Chen, Xiyang Dai, Lu Yuan. 20 Apr 2022. [MoE]
Towards Efficient Single Image Dehazing and Desnowing. Tian-Chun Ye, Sixiang Chen, Yun-Peng Liu, Erkang Chen, Yuche Li. 19 Apr 2022.
HetuMoE: An Efficient Trillion-scale Mixture-of-Expert Distributed Training System. Xiaonan Nie, Pinxue Zhao, Xupeng Miao, Tong Zhao, Bin Cui. 28 Mar 2022. [MoE]
A World-Self Model Towards Understanding Intelligence. Yutao Yue. 25 Mar 2022.