arXiv: 2311.08298

A Survey of Confidence Estimation and Calibration in Large Language Models

14 November 2023
Jiahui Geng
Fengyu Cai
Yuxia Wang
Heinz Koeppl
Preslav Nakov
Iryna Gurevych
    UQCV

Papers citing "A Survey of Confidence Estimation and Calibration in Large Language Models"

50 / 55 papers shown
Why Uncertainty Estimation Methods Fall Short in RAG: An Axiomatic Analysis
Heydar Soudani
Evangelos Kanoulas
Faegheh Hasibi
16
0
0
12 May 2025
From Knowledge to Reasoning: Evaluating LLMs for Ionic Liquids Research in Chemical and Biological Engineering
Gaurab Sarkar
Sougata Saha
16
0
0
11 May 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
85
0
0
25 Apr 2025
Object-Level Verbalized Confidence Calibration in Vision-Language Models via Semantic Perturbation
Yunpu Zhao
Rui Zhang
Junbin Xiao
Ruibo Hou
Jiaming Guo
Zihao Zhang
Yifan Hao
Yunji Chen
25
0
0
21 Apr 2025
Gauging Overprecision in LLMs: An Empirical Study
Adil Bahaj
Hamed Rahimi
Mohamed Chetouani
Mounir Ghogho
61
0
0
16 Apr 2025
Beyond Accuracy: The Role of Calibration in Self-Improving Large Language Models
Liangjie Huang
Dawei Li
Huan Liu
Lu Cheng
LRM
34
0
0
03 Apr 2025
Accelerating Causal Network Discovery of Alzheimer Disease Biomarkers via Scientific Literature-based Retrieval Augmented Generation
Xiaofan Zhou
Liangjie Huang
Pinyang Cheng
Wenpen Yin
Rui Zhang
Wenrui Hao
Lu Cheng
21
0
0
01 Apr 2025
Rewarding Doubt: A Reinforcement Learning Approach to Confidence Calibration of Large Language Models
Paul Stangel
D. Bani-Harouni
Chantal Pellegrini
Ege Ozsoy
Kamilia Zaripova
Matthias Keicher
Nassir Navab
29
1
0
04 Mar 2025
Your Model is Overconfident, and Other Lies We Tell Ourselves
Timothee Mickus
Aman Sinha
Raúl Vázquez
48
0
0
03 Mar 2025
An Efficient Plugin Method for Metric Optimization of Black-Box Models
Siddartha Devic
Nurendra Choudhary
Anirudh Srinivasan
Sahika Genc
B. Kveton
G. Hiranandani
36
0
0
03 Mar 2025
A Survey of Uncertainty Estimation Methods on Large Language Models
Zhiqiu Xia
Jinxuan Xu
Yuqian Zhang
Hang Liu
33
1
0
28 Feb 2025
Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models
Prateek Chhikara
34
1
0
16 Feb 2025
An Empirical Analysis of Uncertainty in Large Language Model Evaluations
Qiujie Xie
Qingqiu Li
Zhuohao Yu
Yuejie Zhang
Yue Zhang
Linyi Yang
ELM
58
1
0
15 Feb 2025
Understanding the Capabilities and Limitations of Weak-to-Strong Generalization
Wei Yao
Wenkai Yang
Z. Wang
Yankai Lin
Yong Liu
ELM
90
1
0
03 Feb 2025
Technical report on label-informed logit redistribution for better domain generalization in low-shot classification with foundation models
Behraj Khan
T. Syed
86
1
0
29 Jan 2025
Reliable Text-to-SQL with Adaptive Abstention
Kaiwen Chen
Yueting Chen
Xiaohui Yu
Nick Koudas
RALM
34
0
0
18 Jan 2025
Pretraining with random noise for uncertainty calibration
Jeonghwan Cheon
Se-Bum Paik
OnRL
41
0
0
23 Dec 2024
A Survey of Calibration Process for Black-Box LLMs
Liangru Xie
Hui Liu
Jingying Zeng
Xianfeng Tang
Yan Han
Chen Luo
Jing Huang
Zhen Li
Suhang Wang
Qi He
74
1
0
17 Dec 2024
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
Roi Cohen
Konstantin Dobler
Eden Biran
Gerard de Melo
83
3
0
09 Dec 2024
A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Z. Li
Yibing Song
K. Wang
Zhangyang Wang
Yang You
77
7
0
04 Dec 2024
Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning
R. Krishnan
Piyush Khanna
Omesh Tickoo
HILM
69
1
0
03 Dec 2024
Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator
Frederic Kirstein
Terry Ruas
Bela Gipp
77
2
0
27 Nov 2024
Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability
Yanjun Gao
Skatje Myers
Shan Chen
Dmitriy Dligach
Timothy A. Miller
Danielle S. Bitterman
Guanhua Chen
Anoop Mayampurath
M. Churpek
Majid Afshar
LM&MA
37
1
0
07 Nov 2024
A Survey of Uncertainty Estimation in LLMs: Theory Meets Practice
Hsiu-Yuan Huang
Yutong Yang
Zhaoxi Zhang
Sanwoo Lee
Yunfang Wu
30
9
0
20 Oct 2024
FIRE: Fact-checking with Iterative Retrieval and Verification
Zhuohan Xie
Rui Xing
Yuxia Wang
Jiahui Geng
Hasan Iqbal
Dhruv Sahnan
Iryna Gurevych
Preslav Nakov
HILM
50
2
0
17 Oct 2024
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Jihan Yao
Wenxuan Ding
Shangbin Feng
Lucy Lu Wang
Yulia Tsvetkov
25
0
0
14 Oct 2024
On Unsupervised Prompt Learning for Classification with Black-box Language Models
Zhen-Yu Zhang
Jiandong Zhang
Huaxiu Yao
Gang Niu
Masashi Sugiyama
21
2
0
04 Oct 2024
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Jiyeon Kim
Hyunji Lee
Hyowon Cho
Joel Jang
Hyeonbin Hwang
Seungpil Won
Youbin Ahn
Dohaeng Lee
Minjoon Seo
KELM
58
2
0
02 Oct 2024
A Survey on the Honesty of Large Language Models
Siheng Li
Cheng Yang
Taiqiang Wu
Chufan Shi
Yuji Zhang
...
Jie Zhou
Yujiu Yang
Ngai Wong
Xixin Wu
Wai Lam
HILM
27
4
0
27 Sep 2024
The Factuality of Large Language Models in the Legal Domain
Rajaa El Hamdani
Thomas Bonald
Fragkiskos D. Malliaros
Nils Holzenberger
Fabian M. Suchanek
AILaw
HILM
29
0
0
18 Sep 2024
Can Unconfident LLM Annotations Be Used for Confident Conclusions?
Kristina Gligorić
Tijana Zrnic
Cinoo Lee
Emmanuel J. Candès
Dan Jurafsky
66
4
0
27 Aug 2024
Reference-free Hallucination Detection for Large Vision-Language Models
Qing Li
Chenyang Lyu
Jiahui Geng
Derui Zhu
Maxim Panov
Fakhri Karray
24
6
0
11 Aug 2024
On the attribution of confidence to large language models
Geoff Keeling
Winnie Street
LRM
18
2
0
11 Jul 2024
What Did I Do Wrong? Quantifying LLMs' Sensitivity and Consistency to Prompt Engineering
Federico Errica
G. Siracusano
D. Sanvito
Roberto Bifulco
72
19
0
18 Jun 2024
Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification
Ekaterina Fadeeva
Aleksandr Rubashevskii
Artem Shelmanov
Sergey Petrakov
Haonan Li
...
Gleb Kuzmin
Alexander Panchenko
Timothy Baldwin
Preslav Nakov
Maxim Panov
HILM
40
38
0
07 Mar 2024
Calibrating Large Language Models with Sample Consistency
Qing Lyu
Kumar Shridhar
Chaitanya Malaviya
Li Zhang
Yanai Elazar
Niket Tandon
Marianna Apidianaki
Mrinmaya Sachan
Chris Callison-Burch
41
22
0
21 Feb 2024
Factuality of Large Language Models in the Year 2024
Yuxia Wang
Minghan Wang
Muhammad Arslan Manzoor
Fei Liu
Georgi Georgiev
Rocktim Jyoti Das
Preslav Nakov
LRM
HILM
30
20
0
04 Feb 2024
On the Calibration of Large Language Models and Alignment
Chiwei Zhu
Benfeng Xu
Quan Wang
Yongdong Zhang
Zhendong Mao
66
32
0
22 Nov 2023
Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness
Jiuhai Chen
Jonas W. Mueller
42
55
0
30 Aug 2023
Can Large Language Models Capture Dissenting Human Voices?
Noah Lee
Na Min An
James Thorne
ALM
37
30
0
23 May 2023
The Internal State of an LLM Knows When It's Lying
A. Azaria
Tom Michael Mitchell
HILM
216
298
0
26 Apr 2023
Stop Measuring Calibration When Humans Disagree
Joris Baan
Wilker Aziz
Barbara Plank
Raquel Fernández
16
53
0
28 Oct 2022
Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis
Yuxin Xiao
Paul Pu Liang
Umang Bhatt
W. Neiswanger
Ruslan Salakhutdinov
Louis-Philippe Morency
173
86
0
10 Oct 2022
Out-of-Distribution Detection and Selective Generation for Conditional Language Models
Jie Jessie Ren
Jiaming Luo
Yao-Min Zhao
Kundan Krishna
Mohammad Saleh
Balaji Lakshminarayanan
Peter J. Liu
OODD
64
94
0
30 Sep 2022
Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach
Yue Yu
Rongzhi Zhang
Ran Xu
Jieyu Zhang
Jiaming Shen
Chao Zhang
42
21
0
15 Sep 2022
What is Flagged in Uncertainty Quantification? Latent Density Models for Uncertainty Categorization
Hao Sun
B. V. Breugel
Jonathan Crabbé
Nabeel Seedat
M. Schaar
22
4
0
11 Jul 2022
Re-Examining Calibration: The Case of Question Answering
Chenglei Si
Chen Zhao
Sewon Min
Jordan L. Boyd-Graber
51
30
0
25 May 2022
Prototypical Calibration for Few-shot Learning of Language Models
Zhixiong Han
Y. Hao
Li Dong
Yutao Sun
Furu Wei
168
52
0
20 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
209
152
0
30 Dec 2020