2408.13510
Cited By
Intelligent Router for LLM Workloads: Improving Performance Through Workload-Aware Scheduling
24 August 2024
Kunal Jain
Anjaly Parayil
Ankur Mallick
Esha Choukse
Xiaoting Qin
Jue Zhang
Íñigo Goiri
Rujia Wang
Chetan Bansal
Victor Rühle
Anoop Kulkarni
Steve Kofsky
Saravan Rajmohan
Papers citing
"Intelligent Router for LLM Workloads: Improving Performance Through Workload-Aware Scheduling"
2 / 2 papers shown
Title
Taming the Titans: A Survey of Efficient LLM Inference Serving
Ranran Zhen
J. Li
Yixin Ji
Z. Yang
Tong Liu
Qingrong Xia
Xinyu Duan
Z. Wang
Baoxing Huai
M. Zhang
LLMAG
77
0
0
28 Apr 2025
GenTorrent: Scaling Large Language Model Serving with An Overlay Network
Fei Fang
Yifan Hua
Shengze Wang
Ruilin Zhou
Y. Liu
Chen Qian
X. Zhang
46
0
0
27 Apr 2025