Leveraging Uncertainty Estimation for Efficient LLM Routing

16 February 2025
Tuo Zhang
Asal Mehradfar
Dimitrios Dimitriadis
Salman Avestimehr
Abstract

Deploying large language models (LLMs) in edge-cloud environments requires an efficient routing strategy to balance cost and response quality. Traditional approaches prioritize either human-preference data or accuracy metrics from benchmark datasets as routing criteria, but these methods suffer from rigidity and subjectivity. Moreover, existing routing frameworks primarily focus on accuracy and cost, neglecting response quality from a human preference perspective. In this work, we propose the Confidence-Driven LLM Router, a novel framework that leverages uncertainty estimation to optimize routing decisions. To comprehensively assess routing performance, we evaluate both system cost efficiency and response quality. In particular, we introduce the novel use of LLM-as-a-Judge to simulate human rating preferences, providing the first systematic assessment of response quality across different routing strategies. Extensive experiments on MT-Bench, GSM8K, and MMLU demonstrate that our approach outperforms state-of-the-art routing methods, achieving superior response quality while maintaining cost efficiency.
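
The routing idea described above can be illustrated with a minimal sketch. The abstract does not specify which uncertainty estimator the Confidence-Driven LLM Router uses, so this example assumes mean token entropy as a stand-in. The function names, the `"edge"` and `"cloud"` labels, and the threshold value are all hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def mean_token_entropy(token_distributions: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in nats) over per-token probability distributions.

    Assumption: each inner sequence is the small model's probability
    distribution over candidate tokens at one generation step.
    """
    if not token_distributions:
        raise ValueError("need at least one token distribution")
    total = 0.0
    for dist in token_distributions:
        # Skip zero-probability entries, since 0 * log(0) is taken as 0.
        total += -sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(token_distributions)


def route(token_distributions: Sequence[Sequence[float]], threshold: float = 0.5) -> str:
    """Return "edge" when the small model looks confident, otherwise "cloud".

    The threshold is a hypothetical tuning knob: raising it keeps more
    queries on the cheaper model, lowering it escalates more of them.
    """
    uncertainty = mean_token_entropy(token_distributions)
    return "cloud" if uncertainty > threshold else "edge"


if __name__ == "__main__":
    confident = [[0.99, 0.01], [0.97, 0.03]]   # peaked distributions
    uncertain = [[0.5, 0.5], [0.4, 0.6]]       # near-uniform distributions
    print(route(confident))
    print(route(uncertain))
```

In this sketch the threshold is what trades cost against response quality: a query is escalated to the larger model only when the smaller model's own output distribution suggests it is unsure.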

@article{zhang2025_2502.11021,
  title={Leveraging Uncertainty Estimation for Efficient LLM Routing},
  author={Tuo Zhang and Asal Mehradfar and Dimitrios Dimitriadis and Salman Avestimehr},
  journal={arXiv preprint arXiv:2502.11021},
  year={2025}
}