2207.05221
Cited By
Language Models (Mostly) Know What They Know
11 July 2022
Saurav Kadavath
Tom Conerly
Amanda Askell
T. Henighan
Dawn Drain
Ethan Perez
Nicholas Schiefer
Zac Hatfield-Dodds
Nova Dassarma
Eli Tran-Johnson
Scott Johnston
S. E. Showk
Andy Jones
Nelson Elhage
Tristan Hume
Anna Chen
Yuntao Bai
Sam Bowman
Stanislav Fort
Deep Ganguli
Danny Hernandez
Josh Jacobson
John Kernion
Shauna Kravec
Liane Lovitt
Kamal Ndousse
Catherine Olsson
Sam Ringer
Dario Amodei
Tom B. Brown
Jack Clark
Nicholas Joseph
Benjamin Mann
Sam McCandlish
C. Olah
Jared Kaplan
ELM
Papers citing
"Language Models (Mostly) Know What They Know"
50 / 111 papers shown
Title
Uncertainty Profiles for LLMs: Uncertainty Source Decomposition and Adaptive Model-Metric Selection
Pei-Fu Guo
Yun-Da Tsai
Shou-De Lin
UD
36
0
0
12 May 2025
Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach
Jiancong Xiao
Bojian Hou
Zhanliang Wang
Ruochen Jin
Q. Long
Weijie Su
Li Shen
28
0
0
04 May 2025
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
Trilok Padhi
R. Kaur
Adam D. Cobb
Manoj Acharya
Anirban Roy
Colin Samplawski
Brian Matejek
Alexander M. Berenbeim
Nathaniel D. Bastian
Susmit Jha
20
0
0
30 Apr 2025
Bi-directional Model Cascading with Proxy Confidence
David Warren
Mark Dras
44
0
0
27 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
85
0
0
25 Apr 2025
Random-Set Large Language Models
Muhammad Mubashar
Shireen Kudukkil Manchingal
Fabio Cuzzolin
61
0
0
25 Apr 2025
CCSK: Cognitive Convection of Self-Knowledge Based Retrieval Augmentation for Large Language Models
Jianling Lu
Mingqi Lv
Tieming Chen
RALM
45
0
0
07 Apr 2025
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji
L. Yu
Yeskendir Koishekenov
Yejin Bang
Anthony Hartshorn
Alan Schelten
Cheng Zhang
Pascale Fung
Nicola Cancedda
46
1
0
18 Mar 2025
Don't lie to your friends: Learning what you know from collaborative self-play
Jacob Eisenstein
Reza Aghajani
Adam Fisch
Dheeru Dua
Fantine Huot
Mirella Lapata
Vicky Zayats
Jonathan Berant
70
0
0
18 Mar 2025
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling
Hang Zheng
Hongshen Xu
Yuncong Liu
Lu Chen
Pascale Fung
Kai Yu
83
2
0
04 Mar 2025
How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach
Ayeong Lee
Ethan Che
Tianyi Peng
LRM
42
10
0
03 Mar 2025
Towards Efficient Educational Chatbots: Benchmarking RAG Frameworks
Umar Ali Khan
Ekram Khan
Fiza Khan
A. A. Moinuddin
48
0
0
02 Mar 2025
Semantic Volume: Quantifying and Detecting both External and Internal Uncertainty in LLMs
Xiaomin Li
Zhou Yu
Ziji Zhang
Yingying Zhuang
S.
Narayanan Sadagopan
Anurag Beniwal
HILM
58
0
0
28 Feb 2025
END: Early Noise Dropping for Efficient and Effective Context Denoising
Hongye Jin
Pei Chen
Jingfeng Yang
Z. Wang
Meng-Long Jiang
...
X. Zhang
Zheng Li
Tianyi Liu
Huasheng Li
Bing Yin
81
0
0
26 Feb 2025
Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods
Nicola Cecere
Andrea Bacciu
Ignacio Fernández Tobías
Amin Mantrach
66
1
0
25 Feb 2025
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Viktor Moskvoretskii
M. Lysyuk
Mikhail Salnikov
Nikolay Ivanov
Sergey Pletenev
Daria Galimzianova
Nikita Krayko
Vasily Konovalov
Irina Nikishina
Alexander Panchenko
RALM
74
4
0
24 Feb 2025
Large Language Model Confidence Estimation via Black-Box Access
Tejaswini Pedapati
Amit Dhurandhar
Soumya Ghosh
Soham Dan
P. Sattigeri
89
3
0
21 Feb 2025
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
Ting-Ruen Wei
Haowei Liu
Xuyang Wu
Yi Fang
LRM
AI4CE
ReLM
KELM
140
1
0
21 Feb 2025
Hallucination Detection in Large Language Models with Metamorphic Relations
Borui Yang
Md Afif Al Mamun
Jie M. Zhang
Gias Uddin
HILM
59
0
0
20 Feb 2025
SMART: Self-Aware Agent for Tool Overuse Mitigation
Cheng Qian
Emre Can Acikgoz
H. Wang
X. Chen
Avirup Sil
Dilek Hakkani-Tür
Gökhan Tür
Heng Ji
LLMAG
KELM
LRM
63
4
0
17 Feb 2025
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Zihuiwen Ye
L. Melo
Younesse Kaddar
Phil Blunsom
S. Kamath S
Yarin Gal
LRM
44
0
0
16 Feb 2025
Has My System Prompt Been Used? Large Language Model Prompt Membership Inference
Roman Levin
Valeriia Cherepanova
Abhimanyu Hans
Avi Schwarzschild
Tom Goldstein
85
1
0
14 Feb 2025
Cost-Saving LLM Cascades with Early Abstention
Michael J. Zellinger
Rex Liu
Matt Thomson
98
0
0
13 Feb 2025
Can ChatGPT Diagnose Alzheimer's Disease?
Quoc Toan Nguyen
Linh Le
Xuan-The Tran
T. Do
Chin-Teng Lin
LM&MA
163
0
0
10 Feb 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
88
11
0
31 Dec 2024
Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding
Nabeel Seedat
Caterina Tozzi
Andrea Hita Ardiaca
M. Schaar
James Weatherall
Adam Taylor
108
0
0
20 Nov 2024
Prompt-Guided Internal States for Hallucination Detection of Large Language Models
Fujie Zhang
Peiqi Yu
Biao Yi
Baolei Zhang
Tong Li
Zheli Liu
HILM
LRM
50
0
0
07 Nov 2024
Dynamic Strategy Planning for Efficient Question Answering with Large Language Models
Tanmay Parekh
Pradyot Prakash
Alexander Radovic
Akshay Shekher
Denis Savenkov
LRM
51
1
0
30 Oct 2024
Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the effect of Epistemic Markers on LLM-based Evaluation
Dongryeol Lee
Yerin Hwang
Yongil Kim
Joonsuk Park
Kyomin Jung
ELM
70
5
0
28 Oct 2024
ToW: Thoughts of Words Improve Reasoning in Large Language Models
Zhikun Xu
Ming Shen
Jacob Dineen
Zhaonan Li
Xiao Ye
Shijie Lu
Aswin Rrv
Chitta Baral
Ben Zhou
LRM
79
1
0
21 Oct 2024
Do LLMs estimate uncertainty well in instruction-following?
Juyeon Heo
Miao Xiong
Christina Heinze-Deml
Jaya Narain
ELM
48
3
0
18 Oct 2024
FIRE: Fact-checking with Iterative Retrieval and Verification
Zhuohan Xie
Rui Xing
Yuxia Wang
Jiahui Geng
Hasan Iqbal
Dhruv Sahnan
Iryna Gurevych
Preslav Nakov
HILM
50
2
0
17 Oct 2024
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Michael J.Q. Zhang
W. Bradley Knox
Eunsol Choi
48
3
0
17 Oct 2024
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang
Pei Zhang
Baosong Yang
Derek F. Wong
Rui-cang Wang
LRM
40
4
0
17 Oct 2024
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
ZhongXiang Sun
Xiaoxue Zang
Kai Zheng
Yang Song
Jun Xu
Xiao Zhang
Weijie Yu
Yang Song
Han Li
55
7
0
15 Oct 2024
On Calibration of LLM-based Guard Models for Reliable Content Moderation
Hongfu Liu
Hengguan Huang
Hao Wang
Xiangming Gu
Ye Wang
53
2
0
14 Oct 2024
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
Enyu Zhou
Guodong Zheng
B. Wang
Zhiheng Xi
Shihan Dou
...
Yurong Mou
Rui Zheng
Tao Gui
Qi Zhang
Xuanjing Huang
ALM
54
14
0
13 Oct 2024
Frame-Voyager: Learning to Query Frames for Video Large Language Models
Sicheng Yu
Chengkai Jin
Huanyu Wang
Zhenghao Chen
Sheng Jin
...
Zhenbang Sun
Bingni Zhang
Jiawei Wu
Hao Zhang
Qianru Sun
67
5
0
04 Oct 2024
Can Language Model Understand Word Semantics as A Chatbot? An Empirical Study of Language Model Internal External Mismatch
Jinman Zhao
Xueyan Zhang
Xingyu Yue
Weizhe Chen
Zifan Qian
Ruiyu Wang
LRM
16
0
0
21 Sep 2024
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty
Yongjin Yang
Haneul Yoo
Hwaran Lee
60
1
0
13 Aug 2024
Cost-Effective Hallucination Detection for LLMs
Simon Valentin
Jinmiao Fu
Gianluca Detommaso
Shaoyuan Xu
Giovanni Zappella
Bryan Wang
HILM
33
4
0
31 Jul 2024
Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment
Sangwon Yu
Jongyoon Song
Bongkyu Hwang
Hoyoung Kang
Sooah Cho
Junhwa Choi
Seongho Joe
Taehee Lee
Youngjune Gwon
Sungroh Yoon
90
4
0
31 Jul 2024
Automated Review Generation Method Based on Large Language Models
Shican Wu
Xiao Ma
Dehui Luo
Lulu Li
Xiangcheng Shi
...
Ran Luo
Chunlei Pei
Zhi-Jian Zhao
Jinlong Gong
69
0
0
30 Jul 2024
Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost
Sania Nayab
Giulio Rossolini
Giorgio Buttazzo
Nicolamaria Manes
Fabrizio Giacomelli
LRM
43
23
0
29 Jul 2024
From Loops to Oops: Fallback Behaviors of Language Models Under Uncertainty
Maor Ivgi
Ori Yoran
Jonathan Berant
Mor Geva
HILM
47
8
0
08 Jul 2024
Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks
Yue Zhou
Henry Peng Zou
Barbara Maria Di Eugenio
Yang Zhang
HILM
LRM
37
1
0
01 Jul 2024
PaCoST: Paired Confidence Significance Testing for Benchmark Contamination Detection in Large Language Models
Huixuan Zhang
Yun Lin
Xiaojun Wan
40
0
0
26 Jun 2024
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs
Jannik Kossen
Jiatong Han
Muhammed Razzak
Lisa Schut
Shreshth A. Malik
Yarin Gal
HILM
46
33
0
22 Jun 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback
Shangbin Feng
Weijia Shi
Yike Wang
Wenxuan Ding
Orevaoghene Ahia
Shuyue Stella Li
Vidhisha Balachandran
Sunayana Sitaram
Yulia Tsvetkov
65
4
0
22 Jun 2024
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
Roman Vashurin
Ekaterina Fadeeva
Artem Vazhentsev
Akim Tsvigun
Daniil Vasilev
...
Timothy Baldwin
Maxim Panov
Artem Shelmanov
HILM
64
8
0
21 Jun 2024