2305.14975
Cited By
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
24 May 2023
Katherine Tian
E. Mitchell
Allan Zhou
Archit Sharma
Rafael Rafailov
Huaxiu Yao
Chelsea Finn
Christopher D. Manning
Papers citing
"Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback"
50 / 228 papers shown
Title
Variational Visual Question Answering
Tobias Jan Wieczorek
Nathalie Daun
Mohammad Emtiyaz Khan
Marcus Rohrbach
OOD
21
0
0
14 May 2025
A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs
Artem Shelmanov
Ekaterina Fadeeva
Akim Tsvigun
Ivan Tsvigun
Zhuohan Xie
...
Caiqi Zhang
Artem Vazhentsev
Mrinmaya Sachan
Preslav Nakov
Timothy Baldwin
HILM
38
0
0
13 May 2025
Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach
Jiancong Xiao
Bojian Hou
Zhanliang Wang
Ruochen Jin
Q. Long
Weijie Su
Li Shen
30
0
0
04 May 2025
Always Tell Me The Odds: Fine-grained Conditional Probability Estimation
Liaoyaqi Wang
Zhengping Jiang
Anqi Liu
Benjamin Van Durme
57
0
0
02 May 2025
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
Trilok Padhi
R. Kaur
Adam D. Cobb
Manoj Acharya
Anirban Roy
Colin Samplawski
Brian Matejek
Alexander M. Berenbeim
Nathaniel D. Bastian
Susmit Jha
22
0
0
30 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
87
0
0
25 Apr 2025
Lightweight Latent Verifiers for Efficient Meta-Generation Strategies
Bartosz Piotrowski
Witold Drzewakowski
Konrad Staniszewski
Piotr Miłoś
LRM
36
0
0
23 Apr 2025
Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation
Zhiyuan Hu
Shiyun Xiong
Yifan Zhang
See-Kiong Ng
Anh Tuan Luu
Bo An
Shuicheng Yan
Bryan Hooi
35
0
0
22 Apr 2025
Exploring the Potential for Large Language Models to Demonstrate Rational Probabilistic Beliefs
Gabriel Freedman
Francesca Toni
LRM
33
0
0
18 Apr 2025
Identifying and Mitigating the Influence of the Prior Distribution in Large Language Models
Liyi Zhang
Veniamin Veselovsky
R. Thomas McCoy
Thomas L. Griffiths
52
0
0
17 Apr 2025
Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data
Shuai Zhao
Linchao Zhu
Yi Yang
37
1
0
14 Apr 2025
Do Reasoning Models Show Better Verbalized Calibration?
Qingcheng Zeng
Weihao Xuan
Leyang Cui
Rob Voigt
LRM
28
0
0
09 Apr 2025
Reasoning Models Know When They're Right: Probing Hidden States for Self-Verification
Anqi Zhang
Yulin Chen
Jane Pan
Chen Zhao
Aurojit Panda
Jinyang Li
He He
ReLM
LRM
36
2
0
07 Apr 2025
Bonsai: Interpretable Tree-Adaptive Grounded Reasoning
Kate Sanders
Benjamin Van Durme
LRM
34
1
0
04 Apr 2025
Locations of Characters in Narratives: Andersen and Persuasion Datasets
Batuhan Ozyurt
Roya Arkhmammadova
Deniz Yuret
29
0
0
04 Apr 2025
How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence
Hongzhe Du
Weikai Li
Min Cai
Karim Saraipour
Zimin Zhang
Himabindu Lakkaraju
Yizhou Sun
Shichang Zhang
KELM
51
0
0
03 Apr 2025
Language Model Uncertainty Quantification with Attention Chain
Yinghao Li
Rushi Qiang
Lama Moukheiber
Chao Zhang
LRM
46
0
0
24 Mar 2025
Uncertainty Quantification and Confidence Calibration in Large Language Models: A Survey
Xiaoou Liu
Tiejin Chen
Longchao Da
Chacha Chen
Zhen Lin
Hua Wei
HILM
62
3
0
20 Mar 2025
Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence
Sophia Hager
David Mueller
Kevin Duh
Nicholas Andrews
65
0
0
18 Mar 2025
Investigating Human-Aligned Large Language Model Uncertainty
Kyle Moore
Jesse Roberts
Daryl Watson
Pamela Wisniewski
51
0
0
16 Mar 2025
Modeling Subjectivity in Cognitive Appraisal with Language Models
Yuxiang Zhou
Hainiu Xu
Desmond C. Ong
Petr Slovak
Yulan He
39
0
0
14 Mar 2025
Uncertainty in Action: Confidence Elicitation in Embodied Agents
Tianjiao Yu
Vedant Shah
Muntasir Wahed
Kiet A. Nguyen
Adheesh Sunil Juvekar
Tal August
Ismini Lourentzou
40
0
0
13 Mar 2025
SCE: Scalable Consistency Ensembles Make Blackbox Large Language Model Generation More Reliable
Jiaxin Zhang
Z. Li
Wendi Cui
Kamalika Das
Bradley Malin
Sricharan Kumar
41
0
0
13 Mar 2025
Taxonomic Reasoning for Rare Arthropods: Combining Dense Image Captioning and RAG for Interpretable Classification
Nathaniel Lesperance
S. Ratnasingham
Graham W. Taylor
VLM
74
0
0
13 Mar 2025
Delusions of Large Language Models
Hongshen Xu
Zixv Yang
Zichen Zhu
Kunyao Lan
Zihan Wang
Mengyue Wu
Ziwei Ji
L. Chen
Pascale Fung
Kai Yu
LRM
HILM
47
0
0
09 Mar 2025
Alignment for Efficient Tool Calling of Large Language Models
Hongshen Xu
Zihan Wang
Zichen Zhu
Lei Pan
Xingyu Chen
L. Chen
Kai Yu
47
0
0
09 Mar 2025
Calibrating LLM Confidence with Semantic Steering: A Multi-Prompt Aggregation Framework
Ziang Zhou
Tianyuan Jin
Jieming Shi
Qing Li
LLMSV
68
0
0
04 Mar 2025
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling
Hang Zheng
Hongshen Xu
Yuncong Liu
Lu Chen
Pascale Fung
Kai Yu
83
2
0
04 Mar 2025
An Efficient Plugin Method for Metric Optimization of Black-Box Models
Siddartha Devic
Nurendra Choudhary
Anirudh Srinivasan
Sahika Genc
B. Kveton
G. Hiranandani
39
0
0
03 Mar 2025
Answer, Refuse, or Guess? Investigating Risk-Aware Decision Making in Language Models
Cheng-Kuang Wu
Zhi Rui Tam
Chieh-Yen Lin
Yun-Nung Chen
Hung-yi Lee
62
0
0
03 Mar 2025
A Survey of Uncertainty Estimation Methods on Large Language Models
Zhiqiu Xia
Jinxuan Xu
Yuqian Zhang
Hang Liu
38
1
0
28 Feb 2025
Conformal Linguistic Calibration: Trading-off between Factuality and Specificity
Zhengping Jiang
Anqi Liu
Benjamin Van Durme
84
1
0
26 Feb 2025
Uncertainty Quantification in Retrieval Augmented Question Answering
Laura Perez-Beltrachini
Mirella Lapata
RALM
43
0
0
25 Feb 2025
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
Boxuan Zhang
Ruqi Zhang
LRM
30
1
0
24 Feb 2025
Text-to-SQL Domain Adaptation via Human-LLM Collaborative Data Annotation
Yuan Tian
Daniel Lee
Fei Wu
Tung Mai
Kun Qian
Siddhartha Sahai
Tianyi Zhang
Yunyao Li
SyDa
43
0
0
21 Feb 2025
CER: Confidence Enhanced Reasoning in LLMs
Ali Razghandi
Seyed Mohsen Hosseini
Mahdieh Soleymani Baghshah
LRM
98
2
0
21 Feb 2025
Large Language Model Confidence Estimation via Black-Box Access
Tejaswini Pedapati
Amit Dhurandhar
Soumya Ghosh
Soham Dan
P. Sattigeri
89
3
0
21 Feb 2025
Verify when Uncertain: Beyond Self-Consistency in Black Box Hallucination Detection
Yihao Xue
Kristjan Greenewald
Youssef Mroueh
Baharan Mirzasoleiman
HILM
46
1
0
20 Feb 2025
Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception
Shiyu Ni
Keping Bi
J. Guo
Lulu Yu
Baolong Bi
Xueqi Cheng
51
2
0
17 Feb 2025
An Empirical Analysis of Uncertainty in Large Language Model Evaluations
Qiujie Xie
Qingqiu Li
Zhuohao Yu
Yuejie Zhang
Yue Zhang
Linyi Yang
ELM
58
1
0
15 Feb 2025
AI Alignment at Your Discretion
Maarten Buyl
Hadi Khalaf
C. M. Verdun
Lucas Monteiro Paes
Caio Vieira Machado
Flavio du Pin Calmon
40
0
0
10 Feb 2025
Unveiling the Capabilities of Large Language Models in Detecting Offensive Language with Annotation Disagreement
Junyu Lu
Kai Ma
Kaichun Wang
Kelaiti Xiao
Roy Ka-Wei Lee
Bo Xu
Liang Yang
Hongfei Lin
44
0
0
10 Feb 2025
Confidence Elicitation: A New Attack Vector for Large Language Models
Brian Formento
Chuan-Sheng Foo
See-Kiong Ng
AAML
94
0
0
07 Feb 2025
Understanding the Capabilities and Limitations of Weak-to-Strong Generalization
Wei Yao
Wenkai Yang
Z. Wang
Yankai Lin
Yong Liu
ELM
97
1
0
03 Feb 2025
What is a Number, That a Large Language Model May Know It?
Raja Marjieh
Veniamin Veselovsky
Thomas L. Griffiths
Ilia Sucholutsky
123
2
0
03 Feb 2025
ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration
Minghang Deng
Ashwin Ramachandran
Canwen Xu
Lanxiang Hu
Zhewei Yao
Anupam Datta
Hao Zhang
LMTD
121
1
0
02 Feb 2025
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
Yibin Wang
H. Shi
Ligong Han
Dimitris N. Metaxas
Hao Wang
BDL
UQLM
104
6
0
28 Jan 2025
Exploring Multi-Modal Integration with Tool-Augmented LLM Agents for Precise Causal Discovery
ChengAo Shen
Z. Chen
Dongsheng Luo
Dongkuan Xu
Haifeng Chen
Jingchao Ni
83
3
0
18 Dec 2024
A Survey of Calibration Process for Black-Box LLMs
Liangru Xie
Hui Liu
Jingying Zeng
Xianfeng Tang
Yan Han
Chen Luo
Jing Huang
Zhen Li
Suhang Wang
Qi He
74
1
0
17 Dec 2024
UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models
Boyang Xue
Fei Mi
Qi Zhu
Hongru Wang
Rui Wang
Sheng Wang
Erxin Yu
Xuming Hu
Kam-Fai Wong
HILM
71
0
0
16 Dec 2024