Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness
arXiv:2310.02410, 3 October 2023
Young Jin Kim, Raffy Fahim, Hany Awadalla
Tags: MQ, MoE
Papers citing "Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness" (4 of 4 papers shown)
D²MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving
Haodong Wang, Qihua Zhou, Zicong Hong, Song Guo
Tags: MoE
17 Apr 2025
CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory
Jiashun Suo, Xiaojian Liao, Limin Xiao, Li Ruan, Jinquan Wang, Xiao Su, Zhisheng Huo
04 Mar 2025
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta, Yanping Huang, Ankur Bapna, M. Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat
Tags: MoE
24 Sep 2021
Scalable and Efficient MoE Training for Multitask Multilingual Models
Young Jin Kim, A. A. Awan, Alexandre Muzio, Andres Felipe Cruz Salinas, Liyang Lu, Amr Hendy, Samyam Rajbhandari, Yuxiong He, Hany Awadalla
Tags: MoE
22 Sep 2021