2404.02015
Cited By
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
2 April 2024
Jiangfei Duan
Runyu Lu
Haojie Duanmu
Xiuhong Li
Xingcheng Zhang
Dahua Lin
Ion Stoica
Hao Zhang
Papers citing "MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving"
3 / 3 papers shown
Title
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Ting Sun
Penghan Wang
Fan Lai
66
1
0
15 Jan 2025
MuxFlow: Efficient and Safe GPU Sharing in Large-Scale Production Deep Learning Clusters
Yihao Zhao
Xin Liu
Shufan Liu
Xiang Li
Yibo Zhu
Gang Huang
Xuanzhe Liu
Xin Jin
27
11
0
24 Mar 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
144
365
0
13 Mar 2023