ResearchTrend.AI
© 2025 ResearchTrend.AI. All rights reserved.
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
24 May 2023
Katherine Tian, E. Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning
Papers citing "Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback"

50 / 228 papers shown
Empowering Biomedical Discovery with AI Agents
Shanghua Gao, Ada Fang, Yepeng Huang, Valentina Giunchiglia, Ayush Noori, Jonathan Richard Schwarz, Yasha Ektefaie, Jovana Kondic, Marinka Zitnik
LLMAG, AI4CE
03 Apr 2024

Calibrating the Confidence of Large Language Models by Eliciting Fidelity
Mozhi Zhang, Mianqiu Huang, Rundong Shi, Linsen Guo, Chong Peng, Peng Yan, Yaqian Zhou, Xipeng Qiu
03 Apr 2024

Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing
Zhenyu Qian, Yiming Qian, Yuting Song, Fei Gao, Hai Jin, Chen Yu, Xia Xie
31 Mar 2024

Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback
Hongshen Xu, Zichen Zhu, Situo Zhang, Da Ma, Shuai Fan, Lu Chen, Kai Yu
HILM
27 Mar 2024

Few-Shot Recalibration of Language Models
Xiang Lisa Li, Urvashi Khandelwal, Kelvin Guu
27 Mar 2024

Third-Party Language Model Performance Prediction from Instruction
Rahul Nadkarni, Yizhong Wang, Noah A. Smith
ELM, LRM
19 Mar 2024

Think Twice Before Trusting: Self-Detection for Large Language Models through Comprehensive Answer Reflection
Moxin Li, Wenjie Wang, Fuli Feng, Fengbin Zhu, Qifan Wang, Tat-Seng Chua
HILM, LRM
15 Mar 2024

Couler: Unified Machine Learning Workflow Optimization in Cloud
Xiaoda Wang, Yuan-ju Tang, Tengda Guo, Bo Sang, Jingji Wu, Jian Sha, Ke Zhang, Jiang Qian, Mingjie Tang
12 Mar 2024

Calibrating Large Language Models Using Their Generations Only
Dennis Ulmer, Martin Gubri, Hwaran Lee, Sangdoo Yun, Seong Joon Oh
UQLM
09 Mar 2024

Bayesian Preference Elicitation with Language Models
Kunal Handa, Yarin Gal, Ellie Pavlick, Noah D. Goodman, Jacob Andreas, Alex Tamkin, Belinda Z. Li
08 Mar 2024

Unfamiliar Finetuning Examples Control How Language Models Hallucinate
Katie Kang, Eric Wallace, Claire Tomlin, Aviral Kumar, Sergey Levine
HILM, LRM
08 Mar 2024

LLMs for Targeted Sentiment in News Headlines: Exploring the Descriptive-Prescriptive Dilemma
Jana Juros, Laura Majer, Jan Snajder
01 Mar 2024

Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
Fan Yin, Jayanth Srinivasa, Kai-Wei Chang
HILM
28 Feb 2024

On the Challenges and Opportunities in Generative AI
Laura Manduchi, Kushagra Pandey, Robert Bamler, Ryan Cotterell, Sina Daubener, ..., F. Wenzel, Frank Wood, Stephan Mandt, Vincent Fortuin
28 Feb 2024

Predict the Next Word: Humans exhibit uncertainty in this task and language models _____
Evgenia Ilia, Wilker Aziz
27 Feb 2024

Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models
Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Tongshuang Wu, Jianshu Chen
HILM, LRM
27 Feb 2024

$C^3$: Confidence Calibration Model Cascade for Inference-Efficient Cross-Lingual Natural Language Understanding
Taixi Lu, Haoyu Wang, Huajie Shao, Jing Gao, Huaxiu Yao
25 Feb 2024

Selective "Selective Prediction": Reducing Unnecessary Abstention in Vision-Language Reasoning
Tejas Srinivasan, Jack Hessel, Tanmay Gupta, Bill Yuchen Lin, Yejin Choi, Jesse Thomason, Khyathi Raghavi Chandu
23 Feb 2024

Soft Self-Consistency Improves Language Model Agents
Han Wang, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal
LLMAG
20 Feb 2024

Thermometer: Towards Universal Calibration for Large Language Models
Maohao Shen, Subhro Das, Kristjan Greenewald, P. Sattigeri, Greg Wornell, Soumya Ghosh
20 Feb 2024

Uncertainty quantification in fine-tuned LLMs using LoRA ensembles
Oleksandr Balabanov, H. Linander
UQCV
19 Feb 2024

Don't Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection
Min Zhang, Jianfeng He, Taoran Ji, Chang-Tien Lu
18 Feb 2024

Multi-Perspective Consistency Enhances Confidence Estimation in Large Language Models
Pei Wang, Yejie Wang, Muxi Diao, Keqing He, Guanting Dong, Weiran Xu
17 Feb 2024

Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models
Hanxing Ding, Liang Pang, Zihao Wei, Huawei Shen, Xueqi Cheng
HILM, RALM
16 Feb 2024

DELL: Generating Reactions and Explanations for LLM-Based Misinformation Detection
Herun Wan, Shangbin Feng, Zhaoxuan Tan, Heng Wang, Yulia Tsvetkov, Minnan Luo
16 Feb 2024

Language Models with Conformal Factuality Guarantees
Christopher Mohri, Tatsunori Hashimoto
HILM
15 Feb 2024

Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng
HILM
14 Feb 2024

Understanding the Effects of Iterative Prompting on Truthfulness
Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju
HILM
09 Feb 2024

Calibrating Long-form Generations from Large Language Models
Yukun Huang, Yixin Liu, Raghuveer Thirukovalluru, Arman Cohan, Bhuwan Dhingra
09 Feb 2024

NoisyICL: A Little Noise in Model Parameters Calibrates In-context Learning
Yufeng Zhao, Yoshihiro Sakai, Naoya Inoue
08 Feb 2024

Reconfidencing LLMs from the Grouping Loss Perspective
Lihu Chen, Alexandre Perez-Lebel, Fabian M. Suchanek, Gaël Varoquaux
07 Feb 2024

ANLS* -- A Universal Document Processing Metric for Generative Large Language Models
David Peer, Philemon Schöpf, V. Nebendahl, A. Rietzler, Sebastian Stabinger
06 Feb 2024

Distinguishing the Knowable from the Unknowable with Language Models
Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman
05 Feb 2024

Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models
Anthony Sicilia, Hyunwoo J. Kim, Khyathi Raghavi Chandu, Malihe Alikhani, Jack Hessel
05 Feb 2024

Calibration and Correctness of Language Models for Code
Claudio Spiess, David Gros, Kunal Suresh Pai, Michael Pradel, Md Rafiqul Islam Rabin, Amin Alipour, Susmit Jha, Prem Devanbu, Toufique Ahmed
03 Feb 2024

Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Vidhisha Balachandran, Yulia Tsvetkov
01 Feb 2024

Towards Uncertainty-Aware Language Agent
Jiuzhou Han, Wray L. Buntine, Ehsan Shareghi
LLMAG, AI4CE
25 Jan 2024

Combining Confidence Elicitation and Sample-based Methods for Uncertainty Quantification in Misinformation Mitigation
Mauricio Rivera, Jean-François Godbout, Reihaneh Rabbany, Kellin Pelrine
HILM
13 Jan 2024

Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty
Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Maarten Sap
12 Jan 2024

Large Language Models for Social Networks: Applications, Challenges, and Solutions
Jingying Zeng, Richard Huang, Waleed Malik, Langxuan Yin, Bojan Babic, Danny Shacham, Xiao Yan, Jaewon Yang, Qi He
04 Jan 2024

Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models
Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho
HILM, AILaw
02 Jan 2024

LLM Factoscope: Uncovering LLMs' Factual Discernment through Inner States Analysis
Jinwen He, Yujia Gong, Kai-xiang Chen, Zijin Lin, Chengán Wei, Yue Zhao
27 Dec 2023

Self-Evaluation Improves Selective Generation in Large Language Models
Jie Jessie Ren, Yao-Min Zhao, Tu Vu, Peter J. Liu, Balaji Lakshminarayanan
ELM
14 Dec 2023

On Diversified Preferences of Large Language Model Alignment
Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu
ALM
12 Dec 2023

Alignment for Honesty
Yuqing Yang, Ethan Chern, Xipeng Qiu, Graham Neubig, Pengfei Liu
12 Dec 2023

A Study on the Calibration of In-context Learning
Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Phillips Foster, Eric Xing, Hima Lakkaraju, Sham Kakade
07 Dec 2023

Transfer Attacks and Defenses for Large Language Models on Coding Tasks
Chi Zhang, Zifan Wang, Ravi Mangal, Matt Fredrikson, Limin Jia, Corina S. Pasareanu
AAML, SILM
22 Nov 2023

On the Calibration of Large Language Models and Alignment
Chiwei Zhu, Benfeng Xu, Quan Wang, Yongdong Zhang, Zhendong Mao
22 Nov 2023

Program-Aided Reasoners (better) Know What They Know
Anubha Kabra, Sanketh Rangreji, Yash Mathur, Aman Madaan, Emmy Liu, Graham Neubig
LRM
16 Nov 2023

Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations
Wenjie Mo, Jiashu Xu, Qin Liu, Jiong Wang, Jun Yan, Chaowei Xiao, Muhao Chen
AAML
16 Nov 2023