2406.18665
Cited By
RouteLLM: Learning to Route LLMs with Preference Data
26 June 2024
Isaac Ong
Amjad Almahairi
Vincent Wu
Wei-Lin Chiang
Tianhao Wu
Joseph E. Gonzalez
M. W. Kadous
Ion Stoica
Papers citing
"RouteLLM: Learning to Route LLMs with Preference Data"
50 / 54 papers shown
Title
Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models
Bin Yu
Hang Yuan
Yuliang Wei
Bailing Wang
Weizhen Qi
Kai Chen
LRM
27
0
0
06 May 2025
Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance
Takuya Tamura
Taro Yano
Masafumi Enomoto
M. Oyamada
34
0
0
28 Apr 2025
Exploring How LLMs Capture and Represent Domain-Specific Knowledge
Mirian Hipolito Garcia
Camille Couturier
Daniel Madrigal Diaz
Ankur Mallick
Anastasios Kyrillidis
Robert Sim
Victor Rühle
Saravan Rajmohan
22
0
0
23 Apr 2025
Dynamic Early Exit in Reasoning Models
Chenxu Yang
Qingyi Si
Yongjie Duan
Zheliang Zhu
Chenyu Zhu
Zheng-Shen Lin
Li Cao
Weiping Wang
ReLM
LRM
28
0
0
22 Apr 2025
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
53
0
0
15 Apr 2025
Reasoning Models Can Be Effective Without Thinking
Wenjie Ma
Jingxuan He
Charlie Snell
Tyler Griggs
Sewon Min
Matei A. Zaharia
ReLM
LRM
42
4
1
14 Apr 2025
EMAFusion: A Self-Optimizing System for Seamless LLM Selection and Integration
Soham Shah
Kumar Shridhar
Surojit Chatterjee
Souvik Sen
32
0
0
14 Apr 2025
Toward Super Agent System with Hybrid AI Routers
Yuhang Yao
Haixin Wang
Yibo Chen
Jiawen Wang
Min Chang Jordan Ren
Bosheng Ding
Salman Avestimehr
Chaoyang He
LLMAG
LM&Ro
31
0
0
11 Apr 2025
Self-Resource Allocation in Multi-Agent LLM Systems
Alfonso Amayuelas
Jingbo Yang
Saaket Agashe
Ashwin Nagarajan
Antonis Antoniades
X. Wang
William Wang
84
0
0
02 Apr 2025
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute
Jianhao Chen
Zishuo Xun
Bocheng Zhou
Han Qi
Qiaosheng Zhang
...
Wei Hu
Yuzhong Qu
Wanli Ouyang
Shuyue Hu
74
0
0
01 Apr 2025
HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents
Shiyi Liu
Haiying Shen
Shuai Che
Mahdi Ghandi
Mingqin Li
LLMAG
48
0
0
01 Apr 2025
Efficient Inference for Large Reasoning Models: A Survey
Y. Liu
Jiaying Wu
Yufei He
Hongcheng Gao
Hongyu Chen
Baolong Bi
Jiaheng Zhang
Zhiqi Huang
Bryan Hooi
LLMAG
LRM
58
7
0
29 Mar 2025
EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing
Yizhang Zhu
Runzhi Jiang
Boyan Li
Nan Tang
Yuyu Luo
34
0
0
28 Mar 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Xiaoye Qu
Yafu Li
Zhaochen Su
Weigao Sun
Jianhao Yan
...
Chaochao Lu
Yue Zhang
Xian-Sheng Hua
Bowen Zhou
Yu Cheng
ReLM
OffRL
LRM
76
11
0
27 Mar 2025
Harnessing Chain-of-Thought Metadata for Task Routing and Adversarial Prompt Detection
Ryan Marinelli
Josef Pichlmeier
Tamás Bisztray
LRM
31
0
0
27 Mar 2025
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Yang Sui
Yu-Neng Chuang
Guanchu Wang
Jiamu Zhang
Tianyi Zhang
...
Hongyi Liu
Andrew Wen
Shaochen Zhong
Hanjie Chen
OffRL
ReLM
LRM
60
21
0
20 Mar 2025
How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities
Aly M. Kassem
Bernhard Schölkopf
Zhijing Jin
24
0
0
20 Mar 2025
Line of Duty: Evaluating LLM Self-Knowledge via Consistency in Feasibility Boundaries
Sahil Kale
Vijaykant Nadadur
38
0
0
14 Mar 2025
G-Boost: Boosting Private SLMs with General LLMs
Yijiang Fan
Yuren Mao
Longbin Lai
Ying Zhang
Zhengping Qian
Yunjun Gao
36
0
0
13 Mar 2025
Queueing, Predictions, and LLMs: Challenges and Open Problems
Michael Mitzenmacher
Rana Shahout
AI4TS
LRM
31
1
0
10 Mar 2025
Life-Cycle Routing Vulnerabilities of LLM Router
Qiqi Lin
Xiaoyang Ji
Shengfang Zhai
Qingni Shen
Zhi-Li Zhang
Yuejian Fang
Yansong Gao
AAML
52
0
0
09 Mar 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
Liang Lin
LRM
52
1
0
08 Mar 2025
When does a predictor know its own loss?
Aravind Gollakota
Parikshit Gopalan
Aayush Karan
Charlotte Peale
Udi Wieder
UQCV
FaML
53
0
0
27 Feb 2025
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble
Zhijun Chen
Jingzheng Li
Pengpeng Chen
Zhuoran Li
Kai Sun
Yuankai Luo
Qianren Mao
Dingqi Yang
Hailong Sun
Philip S. Yu
ELM
50
2
0
25 Feb 2025
Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral
António Farinhas
Nuno M. Guerreiro
Sweta Agrawal
Ricardo Rei
André F. T. Martins
45
0
0
18 Feb 2025
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
60
2
0
17 Feb 2025
Leveraging Uncertainty Estimation for Efficient LLM Routing
Tuo Zhang
Asal Mehradfar
Dimitrios Dimitriadis
Salman Avestimehr
46
0
0
16 Feb 2025
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding
Z. Wang
Muneeza Azmat
Ang Li
R. Horesh
Mikhail Yurochkin
104
0
0
11 Feb 2025
MixLLM: Dynamic Routing in Mixed Large Language Models
Xinyuan Wang
Yanchi Liu
Wei Cheng
Xujiang Zhao
Z. Chen
Wenchao Yu
Yanjie Fu
Haifeng Chen
43
2
0
09 Feb 2025
Recommendations Beyond Catalogs: Diffusion Models for Personalized Generation
Gabriel Patron
Zhiwei Xu
Ishan Kapnadak
Felipe Maia Polo
DiffM
38
0
0
05 Feb 2025
Predictable Artificial Intelligence
Lexin Zhou
Pablo Antonio Moreno Casares
Fernando Martínez-Plumed
John Burden
Ryan Burnell
...
Seán Ó hÉigeartaigh
Danaja Rutar
Wout Schellaert
Konstantinos Voudouris
José Hernández Orallo
31
2
0
08 Jan 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
76
207
0
03 Jan 2025
Efficiently Serving LLM Reasoning Programs with Certaindex
Yichao Fu
Junda Chen
Siqi Zhu
Zheyu Fu
Zhongdongming Dai
Aurick Qiao
Hao Zhang
LRM
46
12
0
31 Dec 2024
PickLLM: Context-Aware RL-Assisted Large Language Model Routing
Dimitrios Sikeridis
Dennis Ramdass
Pranay Pareek
75
1
0
12 Dec 2024
Bench-CoE: a Framework for Collaboration of Experts from Benchmark
Yuanshuai Wang
Xingjian Zhang
Jinkun Zhao
Siwei Wen
Peilin Feng
Shuhao Liao
Lei Huang
Wenjun Wu
MoE
ALM
78
2
0
05 Dec 2024
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
106
61
0
25 Nov 2024
Towards Optimizing SQL Generation via LLM Routing
Mohammadhossein Malekpour
Nour Shaheen
Foutse Khomh
Amine Mhedhbi
AI4TS
31
2
0
06 Nov 2024
Accelerated AI Inference via Dynamic Execution Methods
Haim Barad
Jascha Achterberg
Tien Pei Chou
Jean Yu
26
0
0
30 Oct 2024
MoDEM: Mixture of Domain Expert Models
Toby Simonds
K. K.
Jey Han Lau
MoE
21
1
0
09 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Mohit Bansal
Tianlong Chen
MoMe
MoE
25
1
0
09 Oct 2024
Towards AI-Native Software Engineering (SE 3.0): A Vision and a Challenge Roadmap
Ahmed E. Hassan
G. Oliva
Dayi Lin
Boyuan Chen
Zhen Ming Jiang
31
4
0
08 Oct 2024
Rational Metareasoning for Large Language Models
C. Nicolò De Sabbata
T. Sumers
Thomas L. Griffiths
ReLM
LRM
28
1
0
07 Oct 2024
LLMProxy: Reducing Cost to Access Large Language Models
Noah Martin
Abdullah Bin Faisal
Hiba Eltigani
Rukhshan Haroon
Swaminathan Lamelas
Fahad Dogar
29
1
0
04 Oct 2024
EmbedLLM: Learning Compact Representations of Large Language Models
Richard Zhuang
Tianhao Wu
Zhaojin Wen
Andrew Li
Jiantao Jiao
Kannan Ramchandran
AIFin
22
1
0
03 Oct 2024
Eagle: Efficient Training-Free Router for Multi-LLM Inference
Zesen Zhao
Shuowei Jin
Z. Morley Mao
18
3
0
23 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
50
23
0
10 Sep 2024
Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models
Jerry Huang
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Sarath Chandar
46
0
0
16 Aug 2024
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
Prateek Yadav
Colin Raffel
Mohammed Muqeeth
Lucas Page-Caccia
Haokun Liu
Tianlong Chen
Mohit Bansal
Leshem Choshen
Alessandro Sordoni
MoMe
31
21
0
13 Aug 2024
Arctic-TILT. Business Document Understanding at Sub-Billion Scale
Łukasz Borchmann
Michał Pietruszka
Wojciech Jaśkowski
Dawid Jurkiewicz
Piotr Halama
...
Gabriela Nowakowska
Artur Zawłocki
Łukasz Duhr
Paweł Dyda
Michał Turski
VLM
23
1
0
08 Aug 2024
LLM Inference Serving: Survey of Recent Advances and Opportunities
Baolin Li
Yankai Jiang
V. Gadepally
Devesh Tiwari
64
15
0
17 Jul 2024