arXiv: 2404.10136
Language Model Cascades: Token-level uncertainty and beyond
15 April 2024
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
Papers citing
"Language Model Cascades: Token-level uncertainty and beyond"
36 / 36 papers shown
Title
Uncertainty Quantification for Machine Learning in Healthcare: A Survey
L. J. L. Lopez
Shaza Elsharief
Dhiyaa Al Jorf
Firas Darwish
Congbo Ma
Farah E. Shamout
26
0
0
04 May 2025
Bi-directional Model Cascading with Proxy Confidence
David Warren
Mark Dras
44
0
0
27 Apr 2025
Resource-efficient Inference with Foundation Model Programs
Lunyiu Nie
Zhimin Ding
Kevin Yu
Marco Cheung
C. Jermaine
S. Chaudhuri
24
0
0
09 Apr 2025
Token-Level Uncertainty-Aware Objective for Language Model Post-Training
Tingkai Liu
Ari S. Benjamin
Anthony M. Zador
31
0
0
15 Mar 2025
I Know What I Don't Know: Improving Model Cascades Through Confidence Tuning
Stephan Rabanser
Nathalie Rauschmayr
Achin Kulshrestha
Petra Poklukar
Wittawat Jitkrittum
Sean Augenstein
Congchao Wang
Federico Tombari
42
0
0
26 Feb 2025
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble
Zhijun Chen
Jingzheng Li
Pengpeng Chen
Zhuoran Li
Kai Sun
Yuankai Luo
Qianren Mao
Dingqi Yang
Hailong Sun
Philip S. Yu
ELM
50
2
0
25 Feb 2025
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
Boxuan Zhang
Ruqi Zhang
LRM
30
1
0
24 Feb 2025
Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral
António Farinhas
Nuno M. Guerreiro
Sweta Agrawal
Ricardo Rei
André F. T. Martins
48
0
0
18 Feb 2025
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
60
2
0
17 Feb 2025
An Empirical Analysis of Uncertainty in Large Language Model Evaluations
Qiujie Xie
Qingqiu Li
Zhuohao Yu
Yuejie Zhang
Yue Zhang
Linyi Yang
ELM
58
1
0
15 Feb 2025
Division-of-Thoughts: Harnessing Hybrid Language Model Synergy for Efficient On-Device Agents
Chenyang Shao
Xinyuan Hu
Yutang Lin
Fengli Xu
LLMAG
LRM
57
3
0
06 Feb 2025
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
Yibin Wang
H. Shi
Ligong Han
Dimitris N. Metaxas
Hao Wang
BDL
UQLM
99
6
0
28 Jan 2025
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
H. Shi
Yibin Wang
Ligong Han
H. M. Zhang
Hao Wang
UQCV
83
0
0
07 Dec 2024
A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Z. Li
Yibing Song
K. Wang
Zhangyang Wang
Yang You
77
7
0
04 Dec 2024
DiffServe: Efficiently Serving Text-to-Image Diffusion Models with Query-Aware Model Scaling
Sohaib Ahmad
Qizheng Yang
Haoliang Wang
Ramesh K. Sitaraman
Hui Guan
68
1
0
22 Nov 2024
Interacting Large Language Model Agents. Interpretable Models and Social Learning
Adit Jain
Vikram Krishnamurthy
LLMAG
28
0
0
02 Nov 2024
A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
A. S. Rawat
Veeranjaneyulu Sadhanala
Afshin Rostamizadeh
Ayan Chakrabarti
Wittawat Jitkrittum
...
Rakesh Shivanna
Sashank J. Reddi
A. Menon
Rohan Anil
Sanjiv Kumar
28
2
0
24 Oct 2024
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation
Zixuan Li
Jing Xiong
Fanghua Ye
Chuanyang Zheng
Xun Wu
...
Xiaodan Liang
Chengming Li
Zhenan Sun
Lingpeng Kong
Ngai Wong
RALM
UQLM
27
2
0
03 Oct 2024
Efficiently Deploying LLMs with Controlled Risk
Michael J. Zellinger
Matt Thomson
36
1
0
03 Oct 2024
RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models
Shuhao Chen
Weisen Jiang
Baijiong Lin
James T. Kwok
Yu Zhang
RALM
MQ
40
5
0
30 Sep 2024
A Survey on the Honesty of Large Language Models
Siheng Li
Cheng Yang
Taiqiang Wu
Chufan Shi
Yuji Zhang
...
Jie Zhou
Yujiu Yang
Ngai Wong
Xixin Wu
Wai Lam
HILM
27
4
0
27 Sep 2024
Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
Jaehun Jung
Faeze Brahman
Yejin Choi
ALM
40
11
0
25 Jul 2024
CascadeServe: Unlocking Model Cascades for Inference Serving
Ferdi Kossmann
Ziniu Wu
Alex Turk
Nesime Tatbul
Lei Cao
Samuel Madden
34
2
0
20 Jun 2024
On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
Ziyu Wang
Chris Holmes
UQLM
42
4
0
07 Jun 2024
Cascade-Aware Training of Language Models
Congchao Wang
Sean Augenstein
Keith Rush
Wittawat Jitkrittum
Harikrishna Narasimhan
A. S. Rawat
A. Menon
Alec Go
25
4
0
29 May 2024
Faster Cascades via Speculative Decoding
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
Seungyeon Kim
Neha Gupta
A. Menon
Sanjiv Kumar
LRM
44
6
0
29 May 2024
Cost-Effective Online Multi-LLM Selection with Versatile Reward Models
Xiangxiang Dai
Jin Li
Xutong Liu
Anqi Yu
J. C. Lui
41
5
0
26 May 2024
DiaHalu: A Dialogue-level Hallucination Evaluation Benchmark for Large Language Models
Kedi Chen
Qin Chen
Jie Zhou
Yishen He
Liang He
HILM
35
1
0
01 Mar 2024
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
Zijun Liu
Boqun Kou
Peng Li
Ming Yan
Ji Zhang
Fei Huang
Yang Janet Liu
24
2
0
19 Feb 2024
Online Cascade Learning for Efficient Inference over Streams
Lunyiu Nie
Zhimin Ding
Erdong Hu
Christopher M. Jermaine
Swarat Chaudhuri
29
4
0
07 Feb 2024
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
144
365
0
13 Mar 2023
Out-of-Distribution Detection and Selective Generation for Conditional Language Models
Jie Jessie Ren
Jiaming Luo
Yao-Min Zhao
Kundan Krishna
Mohammad Saleh
Balaji Lakshminarayanan
Peter J. Liu
OODD
64
92
0
30 Sep 2022
On the Relation between Sensitivity and Accuracy in In-context Learning
Yanda Chen
Chen Zhao
Zhou Yu
Kathleen McKeown
He He
180
77
0
16 Sep 2022
BabyBear: Cheap inference triage for expensive language models
Leila Khalili
Yao You
John Bohannon
28
9
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
197
152
0
30 Dec 2020