arXiv: 2410.14257
Revisiting SLO and Goodput Metrics in LLM Serving
18 October 2024
Zhibin Wang, Shipeng Li, Yuhang Zhou, Xue Li, Rong Gu, Nguyen Cam-Tu, Chen Tian, Sheng Zhong
Papers citing "Revisiting SLO and Goodput Metrics in LLM Serving"
3 / 3 papers shown
Title
Faster MoE LLM Inference for Extremely Large Models
Haoqi Yang, Luohe Shi, Qiwei Li, Zuchao Li, Ping Wang, Bo Du, Mengjia Shen, Hai Zhao
MoE
59
0
0
06 May 2025
Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving
Shan Yu, Jiarong Xing, Yifan Qiao, Mingyuan Ma, Y. Li, ..., Shiyi Cao, Ke Bao, Ion Stoica, Harry Xu, Ying Sheng
21
0
0
06 May 2025
Tempo: Application-aware LLM Serving with Mixed SLO Requirements
Wei Zhang, Zhiyu Wu, Yi Mu, Banruo Liu, Myungjin Lee, Fan Lai
51
0
0
24 Apr 2025