Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.13904
Cited By
Calibrating Large Language Models with Sample Consistency
21 February 2024
Qing Lyu
Kumar Shridhar
Chaitanya Malaviya
Li Zhang
Yanai Elazar
Niket Tandon
Marianna Apidianaki
Mrinmaya Sachan
Chris Callison-Burch
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Calibrating Large Language Models with Sample Consistency"
24 / 24 papers shown
Title
Delusions of Large Language Models
Hongshen Xu
Zixv yang
Zichen Zhu
Kunyao Lan
Zihan Wang
Mengyue Wu
Ziwei Ji
L. Chen
Pascale Fung
Kai Yu
LRM
HILM
42
0
0
09 Mar 2025
Alignment for Efficient Tool Calling of Large Language Models
Hongshen Xu
Zihan Wang
Zichen Zhu
Lei Pan
Xingyu Chen
L. Chen
Kai Yu
39
0
0
09 Mar 2025
The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems
Richard Ren
Arunim Agarwal
Mantas Mazeika
Cristina Menghini
Robert Vacareanu
...
Matias Geralnik
Adam Khoja
Dean Lee
Summer Yue
Dan Hendrycks
HILM
ALM
80
0
0
05 Mar 2025
A Survey of Uncertainty Estimation Methods on Large Language Models
Zhiqiu Xia
Jinxuan Xu
Yuqian Zhang
Hang Liu
26
1
0
28 Feb 2025
Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation
Yiwei Li
Ji Zhang
Shaoxiong Feng
Peiwen Yuan
X. Wang
...
Y. Zhang
Chuyi Tan
Boyuan Pan
Yao Hu
Kan Li
HILM
32
1
0
27 Feb 2025
LLMs Can Teach Themselves to Better Predict the Future
Benjamin Turtel
Danny Franklin
Philipp Schoenegger
LRM
49
0
0
07 Feb 2025
UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models
Boyang Xue
Fei Mi
Qi Zhu
Hongru Wang
Rui Wang
Sheng Wang
Erxin Yu
Xuming Hu
Kam-Fai Wong
HILM
67
0
0
16 Dec 2024
SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning
Shivam Adarsh
Kumar Shridhar
Caglar Gulcehre
Nicholas Monath
Mrinmaya Sachan
LRM
19
2
0
24 Oct 2024
Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models
Muhan Lin
Shuyang Shi
Yue (Sophie) Guo
Behdad Chalaki
Vaishnav Tadiparthi
Ehsan Moradi-Pari
Simon Stepputtis
Joseph Campbell
Katia P. Sycara
28
0
0
22 Oct 2024
MlingConf: A Comprehensive Study of Multilingual Confidence Estimation on Large Language Models
Boyang Xue
Hongru Wang
Rui Wang
Sheng Wang
Zezhong Wang
Yiming Du
Bin Liang
Kam-Fai Wong
14
0
0
16 Oct 2024
Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models
Bozhou Li
Hao Liang
Yang Li
Fangcheng Fu
Hongzhi Yin
Conghui He
Wentao Zhang
KELM
CLL
33
0
0
08 Oct 2024
Mirror-Consistency: Harnessing Inconsistency in Majority Voting
Siyuan Huang
Zhiyuan Ma
Jintao Du
Changhua Meng
Weiqiang Wang
Zhouhan Lin
LRM
13
0
0
07 Oct 2024
A Survey on the Honesty of Large Language Models
Siheng Li
Cheng Yang
Taiqiang Wu
Chufan Shi
Yuji Zhang
...
Jie Zhou
Yujiu Yang
Ngai Wong
Xixin Wu
Wai Lam
HILM
12
2
0
27 Sep 2024
Do Not Design, Learn: A Trainable Scoring Function for Uncertainty Estimation in Generative LLMs
D. Yaldiz
Yavuz Faruk Bakman
Baturalp Buyukates
Chenyang Tao
Anil Ramakrishna
Dimitrios Dimitriadis
Jieyu Zhao
Salman Avestimehr
30
1
0
17 Jun 2024
SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales
Tianyang Xu
Shujin Wu
Shizhe Diao
Xiaoze Liu
Xingyao Wang
Yangyi Chen
Jing Gao
LRM
14
5
0
31 May 2024
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Zorik Gekhman
G. Yona
Roee Aharoni
Matan Eyal
Amir Feder
Roi Reichart
Jonathan Herzig
46
98
0
09 May 2024
DyKnow:Dynamically Verifying Time-Sensitive Factual Knowledge in LLMs
Seyed Mahed Mousavi
Simone Alghisi
Giuseppe Riccardi
KELM
22
5
0
10 Apr 2024
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback
Hongshen Xu
Zichen Zhu
Situo Zhang
Da Ma
Shuai Fan
Lu Chen
Kai Yu
HILM
21
5
0
27 Mar 2024
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning
Kushal Kumar Jain
Moritz Miller
Niket Tandon
Kumar Shridhar
ReLM
LRM
22
2
0
14 Nov 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
297
3,163
0
21 Mar 2022
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
242
460
0
06 Jan 2021
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
193
108
0
30 Dec 2020
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Balaji Lakshminarayanan
Alexander Pritzel
Charles Blundell
UQCV
BDL
268
4,940
0
05 Dec 2016
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning
Y. Gal
Zoubin Ghahramani
UQCV
BDL
243
9,042
0
06 Jun 2015
1