Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.14334
Cited By
Teaching Models to Express Their Uncertainty in Words
28 May 2022
Stephanie C. Lin
Jacob Hilton
Owain Evans
OOD
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Teaching Models to Express Their Uncertainty in Words"
50 / 70 papers shown
Title
Uncertainty Profiles for LLMs: Uncertainty Source Decomposition and Adaptive Model-Metric Selection
Pei-Fu Guo
Yun-Da Tsai
Shou-De Lin
UD
51
0
0
12 May 2025
Adaptive Stress Testing Black-Box LLM Planners
Neeloy Chakraborty
John Pohovey
Melkior Ornik
Katherine Driggs-Campbell
28
0
0
08 May 2025
Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach
Jiancong Xiao
Bojian Hou
Zhanliang Wang
Ruochen Jin
Q. Long
Weijie Su
Li Shen
35
0
0
04 May 2025
Random-Set Large Language Models
Muhammad Mubashar
Shireen Kudukkil Manchingal
Fabio Cuzzolin
66
0
0
25 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
92
0
0
25 Apr 2025
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji
L. Yu
Yeskendir Koishekenov
Yejin Bang
Anthony Hartshorn
Alan Schelten
Cheng Zhang
Pascale Fung
Nicola Cancedda
53
1
0
18 Mar 2025
Don't lie to your friends: Learning what you know from collaborative self-play
Jacob Eisenstein
Reza Aghajani
Adam Fisch
Dheeru Dua
Fantine Huot
Mirella Lapata
Vicky Zayats
Jonathan Berant
72
0
0
18 Mar 2025
FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models
Hongzhan Lin
Yang Deng
Yuxuan Gu
Wenxuan Zhang
Jing Ma
See-Kiong Ng
Tat-Seng Chua
LLMAG
KELM
HILM
68
0
0
25 Feb 2025
Large Language Model Confidence Estimation via Black-Box Access
Tejaswini Pedapati
Amit Dhurandhar
Soumya Ghosh
Soham Dan
P. Sattigeri
89
3
0
21 Feb 2025
Can ChatGPT Diagnose Alzheimer's Disease?
Quoc Toan Nguyen
Linh Le
Xuan-The Tran
T. Do
Chin-Teng Lin
LM&MA
234
0
0
10 Feb 2025
Is LLM an Overconfident Judge? Unveiling the Capabilities of LLMs in Detecting Offensive Language with Annotation Disagreement
Junyu Lu
Kai Ma
Kaichun Wang
Kelaiti Xiao
Roy Ka-Wei Lee
Bo Xu
Liang Yang
Hongfei Lin
51
0
0
10 Feb 2025
Confidence Elicitation: A New Attack Vector for Large Language Models
Brian Formento
Chuan-Sheng Foo
See-Kiong Ng
AAML
99
0
0
07 Feb 2025
A statistically consistent measure of Semantic Variability using Language Models
Yi Liu
73
0
0
01 Feb 2025
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
Haizhou Shi
Yibin Wang
Ligong Han
H. M. Zhang
Hao Wang
UQCV
83
0
0
07 Dec 2024
Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the effect of Epistemic Markers on LLM-based Evaluation
Dongryeol Lee
Yerin Hwang
Yongil Kim
Joonsuk Park
Kyomin Jung
ELM
72
5
0
28 Oct 2024
Do LLMs estimate uncertainty well in instruction-following?
Juyeon Heo
Miao Xiong
Christina Heinze-Deml
Jaya Narain
ELM
52
3
0
18 Oct 2024
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang
Pei Zhang
Baosong Yang
Derek F. Wong
Rui-cang Wang
LRM
50
4
0
17 Oct 2024
On Calibration of LLM-based Guard Models for Reliable Content Moderation
Hongfu Liu
Hengguan Huang
Hao Wang
Xiangming Gu
Ye Wang
55
2
0
14 Oct 2024
Taming Overconfidence in LLMs: Reward Calibration in RLHF
Jixuan Leng
Chengsong Huang
Banghua Zhu
Jiaxin Huang
34
7
0
13 Oct 2024
COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act
Philipp Guldimann
Alexander Spiridonov
Robin Staab
Nikola Jovanović
Mark Vero
...
Mislav Balunović
Nikola Konstantinov
Pavol Bielik
Petar Tsankov
Martin Vechev
ELM
50
4
0
10 Oct 2024
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs
Ruijia Niu
D. Wu
Rose Yu
Yi Ma
33
1
0
09 Oct 2024
Calibrating Expressions of Certainty
Peiqi Wang
Barbara D. Lam
Yingcheng Liu
Ameneh Asgari-Targhi
Rameswar Panda
W. Wells
Tina Kapur
Polina Golland
34
1
0
06 Oct 2024
Enhancing Healthcare LLM Trust with Atypical Presentations Recalibration
Jeremy Qin
Bang Liu
Quoc Dinh Nguyen
35
2
0
05 Sep 2024
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty
Yongjin Yang
Haneul Yoo
Hwaran Lee
65
1
0
13 Aug 2024
Cost-Effective Hallucination Detection for LLMs
Simon Valentin
Jinmiao Fu
Gianluca Detommaso
Shaoyuan Xu
Giovanni Zappella
Bryan Wang
HILM
42
4
0
31 Jul 2024
PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models
Huixuan Zhang
Yun Lin
Xiaojun Wan
50
0
0
26 Jun 2024
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs
Jannik Kossen
Jiatong Han
Muhammed Razzak
Lisa Schut
Shreshth A. Malik
Yarin Gal
HILM
60
34
0
22 Jun 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback
Shangbin Feng
Weijia Shi
Yike Wang
Wenxuan Ding
Orevaoghene Ahia
Shuyue Stella Li
Vidhisha Balachandran
Sunayana Sitaram
Yulia Tsvetkov
72
4
0
22 Jun 2024
Counterfactual Debating with Preset Stances for Hallucination Elimination of LLMs
Yi Fang
Moxin Li
Wenjie Wang
Hui Lin
Fuli Feng
LRM
65
5
0
17 Jun 2024
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
Shengkang Wang
Hongzhan Lin
Ziyang Luo
Zhen Ye
Guang Chen
Jing Ma
68
3
0
17 Jun 2024
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang
Shiqi Shen
Guangyao Shen
Zhi Gong
Yankai Lin
Zhi Gong
Yankai Lin
Ji-Rong Wen
55
13
0
17 Jun 2024
Eliciting Informative Text Evaluations with Large Language Models
Yuxuan Lu
Shengwei Xu
Yichi Zhang
Yuqing Kong
Grant Schoenebeck
36
5
0
23 May 2024
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
Yu Feng
Ben Zhou
Weidong Lin
Dan Roth
76
5
0
18 Apr 2024
Confidence Calibration and Rationalization for LLMs via Multi-Agent Deliberation
Ruixin Yang
Dheeraj Rajagopal
S. Hayati
Bin Hu
Dongyeop Kang
LLMAG
43
4
0
14 Apr 2024
Calibrating the Confidence of Large Language Models by Eliciting Fidelity
Mozhi Zhang
Mianqiu Huang
Rundong Shi
Linsen Guo
Chong Peng
Peng Yan
Yaqian Zhou
Xipeng Qiu
22
10
0
03 Apr 2024
Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art
Neeloy Chakraborty
Melkior Ornik
Katherine Driggs-Campbell
LRM
57
9
0
25 Mar 2024
On the Challenges and Opportunities in Generative AI
Laura Manduchi
Kushagra Pandey
Robert Bamler
Ryan Cotterell
Sina Daubener
...
F. Wenzel
Frank Wood
Stephan Mandt
Vincent Fortuin
Vincent Fortuin
56
17
0
28 Feb 2024
Calibrating Large Language Models with Sample Consistency
Qing Lyu
Kumar Shridhar
Chaitanya Malaviya
Li Zhang
Yanai Elazar
Niket Tandon
Marianna Apidianaki
Mrinmaya Sachan
Chris Callison-Burch
43
23
0
21 Feb 2024
Soft Self-Consistency Improves Language Model Agents
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
LLMAG
24
7
0
20 Feb 2024
The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey
Dhruv Dhamani
Mary Lou Maher
30
1
0
29 Dec 2023
Robust Knowledge Extraction from Large Language Models using Social Choice Theory
Nico Potyka
Yuqicheng Zhu
Yunjie He
Evgeny Kharlamov
Steffen Staab
24
3
0
22 Dec 2023
Universal Self-Consistency for Large Language Model Generation
Xinyun Chen
Renat Aksitov
Uri Alon
Jie Jessie Ren
Kefan Xiao
Pengcheng Yin
Sushant Prakash
Charles Sutton
Xuezhi Wang
Denny Zhou
LRM
26
66
0
29 Nov 2023
ClimateX: Do LLMs Accurately Assess Human Expert Confidence in Climate Statements?
Romain Lacombe
Kerrie Wu
Eddie Dilworth
40
5
0
28 Nov 2023
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
Genglin Liu
Xingyao Wang
Lifan Yuan
Yangyi Chen
Hao Peng
29
16
0
16 Nov 2023
Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation
Vaishnavi Shrivastava
Percy Liang
Ananya Kumar
18
28
0
15 Nov 2023
Quantifying Uncertainty in Natural Language Explanations of Large Language Models
Sree Harsha Tanneru
Chirag Agarwal
Himabindu Lakkaraju
LRM
27
14
0
06 Nov 2023
Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method
Yukun Zhao
Lingyong Yan
Weiwei Sun
Guoliang Xing
Chong Meng
Shuaiqiang Wang
Zhicong Cheng
Zhaochun Ren
Dawei Yin
29
35
0
27 Oct 2023
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
Yuchen Zhuang
Xiang Chen
Tong Yu
Saayan Mitra
Victor S. Bursztyn
Ryan A. Rossi
Somdeb Sarkhel
Chao Zhang
LLMAG
36
53
0
20 Oct 2023
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
Zekun Wang
Zhongyuan Peng
Haoran Que
Jiaheng Liu
Wangchunshu Zhou
...
Wanli Ouyang
Ke Xu
Wenhu Chen
Jie Fu
Junran Peng
LLMAG
41
83
0
01 Oct 2023
Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions
Pouya Pezeshkpour
Estevam R. Hruschka
LRM
20
126
0
22 Aug 2023
1
2
Next