2411.01433
HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference
3 November 2024
Peng Tang
Jiacheng Liu
X. Hou
Yifei Pu
Jing Wang
Pheng-Ann Heng
C. Li
M. Guo
MoE
Papers citing
"HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference"
2 / 2 papers shown
Title
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design
Haojie Duanmu
Xiuhong Li
Zhihang Yuan
Size Zheng
Jiangfei Duan
Xingcheng Zhang
Dahua Lin
MQ
MoE
43
0
0
09 May 2025
FloE: On-the-Fly MoE Inference on Memory-constrained GPU
Yuxin Zhou
Zheng Li
J. Zhang
Jue Wang
Y. Wang
Zhongle Xie
Ke Chen
Lidan Shou
MoE
29
0
0
09 May 2025