2305.14975
Cited By
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
24 May 2023
Katherine Tian
Eric Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
Papers citing
"Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback"
28 / 228 papers shown
Title
Towards A Unified View of Answer Calibration for Multi-Step Reasoning
Shumin Deng
Ningyu Zhang
Nay Oo
Bryan Hooi
LRM
34
1
0
15 Nov 2023
Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation
Vaishnavi Shrivastava
Percy Liang
Ananya Kumar
13
28
0
15 Nov 2023
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
Bairu Hou
Yujian Liu
Kaizhi Qian
Jacob Andreas
Shiyu Chang
Yang Zhang
UD
UQCV
PER
16
48
0
15 Nov 2023
Fine-tuning Language Models for Factuality
Katherine Tian
Eric Mitchell
Huaxiu Yao
Christopher D. Manning
Chelsea Finn
KELM
HILM
SyDa
17
166
0
14 Nov 2023
A Survey of Confidence Estimation and Calibration in Large Language Models
Jiahui Geng
Fengyu Cai
Yuxia Wang
Heinz Koeppl
Preslav Nakov
Iryna Gurevych
UQCV
41
54
0
14 Nov 2023
Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning
Yue Yu
Jiaming Shen
Tianqi Liu
Zhen Qin
Jing Nathan Yan
Jialu Liu
Chao Zhang
Michael Bendersky
44
6
0
13 Nov 2023
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency
Jiaxin Zhang
Zhuohang Li
Kamalika Das
Bradley Malin
Kumar Sricharan
HILM
LRM
24
56
0
03 Nov 2023
LitCab: Lightweight Language Model Calibration over Short- and Long-form Responses
Xin Liu
Muhammad Khalifa
Lu Wang
ALM
23
18
0
30 Oct 2023
An Emulator for Fine-Tuning Large Language Models using Small Language Models
Eric Mitchell
Rafael Rafailov
Archit Sharma
Chelsea Finn
Christopher D. Manning
ALM
33
51
0
19 Oct 2023
Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting
Guande He
Peng Cui
Jianfei Chen
Wenbo Hu
Jun Zhu
47
11
0
18 Oct 2023
An Empirical Study of Translation Hypothesis Ensembling with Large Language Models
António Farinhas
José G. C. de Souza
André F. T. Martins
23
8
0
17 Oct 2023
How (not) to ensemble LVLMs for VQA
Lisa Alazraki
Lluis Castrejon
Mostafa Dehghani
Fantine Huot
J. Uijlings
Thomas Mensink
27
3
0
10 Oct 2023
Label-free Node Classification on Graphs with Large Language Models (LLMS)
Zhikai Chen
Haitao Mao
Hongzhi Wen
Haoyu Han
Wei-dong Jin
Haiyang Zhang
Hui Liu
Jiliang Tang
28
72
0
07 Oct 2023
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs
Justin Chih-Yao Chen
Swarnadeep Saha
Mohit Bansal
LLMAG
LRM
35
119
0
22 Sep 2023
PACE-LM: Prompting and Augmentation for Calibrated Confidence Estimation with GPT-4 in Cloud Incident Root Cause Analysis
Dylan Zhang
Xuchao Zhang
Chetan Bansal
P. Las-Casas
Rodrigo Fonseca
Saravan Rajmohan
38
1
0
11 Sep 2023
Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness
Jiuhai Chen
Jonas W. Mueller
42
55
0
30 Aug 2023
Evaluation and Analysis of Hallucination in Large Vision-Language Models
Junyan Wang
Yi Zhou
Guohai Xu
Pengcheng Shi
Chenlin Zhao
...
Mingshi Yan
Ji Zhang
Jihua Zhu
Jitao Sang
Haoyu Tang
MLLM
21
65
0
29 Aug 2023
Bayesian Low-rank Adaptation for Large Language Models
Adam X. Yang
Maxime Robeyns
Xi Wang
Laurence Aitchison
AI4CE
BDL
16
44
0
24 Aug 2023
Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries
Noel Ngu
Nathaniel Lee
Paulo Shakarian
16
4
0
22 Aug 2023
A Survey on Evaluation of Large Language Models
Yu-Chu Chang
Xu Wang
Jindong Wang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM
LM&MA
ALM
58
1,510
0
06 Jul 2023
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
Miao Xiong
Zhiyuan Hu
Xinyang Lu
Yifei Li
Jie Fu
Junxian He
Bryan Hooi
28
369
0
22 Jun 2023
When to Read Documents or QA History: On Unified and Selective Open-domain QA
Kyungjae Lee
Sanghyun Han
Seung-won Hwang
Moontae Lee
RALM
14
4
0
07 Jun 2023
RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning
Alexander Scarlatos
Andrew S. Lan
OffRL
LRM
21
20
0
23 May 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
280
3,000
0
22 Mar 2023
Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis
Yuxin Xiao
Paul Pu Liang
Umang Bhatt
W. Neiswanger
Ruslan Salakhutdinov
Louis-Philippe Morency
175
86
0
10 Oct 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
308
11,915
0
04 Mar 2022
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
209
153
0
30 Dec 2020
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,587
0
18 Sep 2019