Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.14618
Cited By
Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing
22 April 2024
Dujian Ding
Ankur Mallick
Chi Wang
Robert Sim
Subhabrata Mukherjee
Victor Rühle
L. Lakshmanan
Ahmed Hassan Awadallah
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing"
18 / 18 papers shown
Title
Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering
Jihao Zhao
Chunlai Zhou
Biao Qin
45
0
0
05 May 2025
UserCentrix: An Agentic Memory-augmented AI Framework for Smart Spaces
Alaa Saleh
Sasu Tarkoma
Praveen Kumar Donta
Naser Hossein Motlagh
Schahram Dustdar
Susanna Pirttikangas
Lauri Lovén
39
0
0
01 May 2025
Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing
Yi-Kai Zhang
De-Chuan Zhan
Han-Jia Ye
ALM
ELM
LRM
29
1
0
24 Feb 2025
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
60
2
0
17 Feb 2025
Towards Optimizing SQL Generation via LLM Routing
Mohammadhossein Malekpour
Nour Shaheen
Foutse Khomh
Amine Mhedhbi
AI4TS
31
2
0
06 Nov 2024
Towards Edge General Intelligence via Large Language Models: Opportunities and Challenges
Handi Chen
Weipeng Deng
Shuo Yang
J. Xu
Zhihan Jiang
Edith Ngai
Jiangchuan Liu
Xue Liu
ELM
14
1
0
16 Oct 2024
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
Hojae Lee
Junho Kim
SangKeun Lee
LRM
16
1
0
11 Oct 2024
GraphRouter: A Graph-based Router for LLM Selections
Tao Feng
Yanzhen Shen
Jiaxuan You
35
9
0
04 Oct 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
48
23
0
10 Sep 2024
HyPA-RAG: A Hybrid Parameter Adaptive Retrieval-Augmented Generation System for AI Legal and Policy Applications
Rishi Kalra
Zekun Wu
Ayesha Gulley
Airlie Hilliard
Xin Guan
Adriano Soares Koshiyama
Philip C. Treleaven
RALM
AILaw
31
5
0
29 Aug 2024
SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models
Kaushal Kumar Maurya
KV Aditya Srivatsa
Ekaterina Kochmar
34
2
0
16 Aug 2024
MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs
Quang H. Nguyen
Duy C. Hoang
Juliette Decugis
Saurav Manchanda
Nitesh V. Chawla
Khoa D. Doan
Khoa D. Doan
35
6
0
15 Jul 2024
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
Guanqiao Qu
Qiyuan Chen
Wei Wei
Zheng Lin
Xianhao Chen
Kaibin Huang
27
37
0
09 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
16
17
0
08 Jul 2024
RouteLLM: Learning to Route LLMs with Preference Data
Isaac Ong
Amjad Almahairi
Vincent Wu
Wei-Lin Chiang
Tianhao Wu
Joseph E. Gonzalez
M. W. Kadous
Ion Stoica
56
66
0
26 Jun 2024
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection
Guillem Ramírez
Alexandra Birch
Ivan Titov
25
8
0
03 May 2024
AutoMix: Automatically Mixing Language Models
Pranjal Aggarwal
Aman Madaan
Ankit Anand
Srividya Pranavi Potharaju
Swaroop Mishra
...
Karthik Kappaganthu
Yiming Yang
Shyam Upadhyay
Manaal Faruqui
Mausam
32
17
0
19 Oct 2023
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
264
5,290
0
05 Nov 2016
1