2503.09716
Cited By
MoE-Gen: High-Throughput MoE Inference on a Single GPU with Module-Based Batching
12 March 2025
Tairan Xu
Leyang Xue
Zhan Lu
Adrian Jackson
Luo Mai
MoE
Papers citing
"MoE-Gen: High-Throughput MoE Inference on a Single GPU with Module-Based Batching"
1 / 1 papers shown
Title
MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints
Yichao Yuan
Lin Ma
Nishil Talati
MoE
54
0
0
12 Apr 2025