2312.09300
Cited By
Self-Evaluation Improves Selective Generation in Large Language Models
14 December 2023
Jie Jessie Ren
Yao-Min Zhao
Tu Vu
Peter J. Liu
Balaji Lakshminarayanan
ELM
HuggingFace (16 upvotes)
Papers citing "Self-Evaluation Improves Selective Generation in Large Language Models"
37 / 37 papers shown
Can LLMs Make (Personalized) Access Control Decisions?
Friederike Groschupp
Daniele Lain
Aritra Dhar
Lara Magdalena Lazier
Srdjan Capkun
20
1
0
25 Nov 2025
Large Language Models for Explainable Threat Intelligence
Tiago Dinis
Miguel Correia
Roger Tavares
108
0
0
07 Nov 2025
E-CARE: An Efficient LLM-based Commonsense-Augmented Framework for E-Commerce
Ge Zhang
Rohan Deepak Ajwani
Tony Zheng
Hongjian Gu
Yaochen Hu
Wei Guo
Mark Coates
Yingxue Zhang
LRM
108
0
0
06 Nov 2025
Efficient semantic uncertainty quantification in language models via diversity-steered sampling
Ji Won Park
K. Cho
74
0
0
24 Oct 2025
Latent Thinking Optimization: Your Latent Reasoning Language Model Secretly Encodes Reward Signals in Its Latent Thoughts
Hanwen Du
Yuxin Dong
Xia Ning
LRM
AI4CE
74
1
0
30 Sep 2025
DSCC-HS: A Dynamic Self-Reinforcing Framework for Hallucination Suppression in Large Language Models
Xiao Zheng
HILM
88
0
0
17 Sep 2025
Maestro: Self-Improving Text-to-Image Generation via Agent Orchestration
Xingchen Wan
Han Zhou
Ruoxi Sun
Hootan Nakhost
Ke Jiang
Rajarishi Sinha
Sercan Ö. Arık
156
2
0
12 Sep 2025
HalluField: Detecting LLM Hallucinations via Field-Theoretic Modeling
Minh Nhat Vu
Brian K. Tran
Syed A. Shah
Geigh Zollicoffer
N. Hoang-Xuan
Manish Bhattarai
96
0
0
12 Sep 2025
Deep Think with Confidence
Yichao Fu
Xuewei Wang
Yuandong Tian
Jiawei Zhao
ReLM
BDL
LRM
126
51
0
21 Aug 2025
Explainer-guided Targeted Adversarial Attacks against Binary Code Similarity Detection Models
Mingjie Chen
Tiancheng Zhu
Mingxue Zhang
Yiling He
Minghao Lin
Penghui Li
Kui Ren
AAML
163
8
0
05 Jun 2025
Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations
Jinyuan Luo
Zhen Fang
Shouqing Yang
Seongheon Park
Ling Chen
AAML
HILM
181
0
0
03 Jun 2025
GenCLS++: Pushing the Boundaries of Generative Classification in LLMs Through Comprehensive SFT and RL Studies Across Diverse Datasets
Mingqian He
Fei Zhao
Chonggang Lu
Ziqiang Liu
Yun Wang
Haofu Qian
OffRL
AI4TS
VLM
211
3
0
28 Apr 2025
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers
Dylan Bouchard
Mohit Singh Chauhan
HILM
470
4
0
27 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
563
2
0
25 Apr 2025
Agentic Keyframe Search for Video Question Answering
Sunqi Fan
Meng-Hao Guo
Shuojin Yang
169
3
0
20 Mar 2025
Scalable Best-of-N Selection for Large Language Models via Self-Certainty
Zhewei Kang
Xuandong Zhao
Dawn Song
LRM
217
58
0
25 Feb 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Neural Information Processing Systems (NeurIPS), 2024
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
358
13
0
21 Feb 2025
Theoretical Physics Benchmark (TPBench) -- a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
Daniel J.H. Chung
Zhiqi Gao
Yurii Kvasiuk
Tianyi Li
Moritz Münchmeyer
Maja Rudolph
Frederic Sala
Sai Chaitanya Tadepalli
AIMat
205
16
0
19 Feb 2025
BioAgents: Democratizing Bioinformatics Analysis with Multi-Agent Systems
Nikita Mehandru
Amanda K. Hall
Olesya Melnichenko
Yulia Dubinina
Daniel Tsirulnikov
David Bamman
Ahmed Alaa
Scott Saponas
Venkat S. Malladi
185
10
0
10 Jan 2025
Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding
Nabeel Seedat
Caterina Tozzi
Andrea Hita Ardiaca
Mihaela van der Schaar
James Weatherall
Adam Taylor
978
0
0
20 Nov 2024
Matchmaker: Self-Improving Large Language Model Programs for Schema Matching
Nabeel Seedat
Mihaela van der Schaar
168
9
0
31 Oct 2024
Interpretable Contrastive Monte Carlo Tree Search Reasoning
Zitian Gao
Boye Niu
Xuzheng He
Haotian Xu
Hongzhang Liu
Aiwei Liu
Xuming Hu
Lijie Wen
LRM
373
59
0
02 Oct 2024
A Survey on the Honesty of Large Language Models
Siheng Li
Cheng Yang
Taiqiang Wu
Chufan Shi
Yuji Zhang
...
Jie Zhou
Yujiu Yang
Ngai Wong
Xixin Wu
Wai Lam
HILM
262
15
0
27 Sep 2024
HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection
Neural Information Processing Systems (NeurIPS), 2024
Xuefeng Du
Chaowei Xiao
Yixuan Li
HILM
216
56
0
26 Sep 2024
Doppelgänger's Watch: A Split Objective Approach to Large Language Models
S. Ghasemlou
Ashish Katiyar
Aparajita Saraf
Seungwhan Moon
Mangesh Pujari
Pinar E. Donmez
Babak Damavandi
Anuj Kumar
160
0
0
09 Sep 2024
Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain
International Conference on Applications of Natural Language to Data Bases (NLDB), 2024
Francesca Grasso
Stefano Locci
138
6
0
30 Aug 2024
DataGen: Unified Synthetic Dataset Generation via Large Language Models
IEEE International Joint Conference on Neural Networks (IJCNN), 2025
Yue Huang
Siyuan Wu
Chujie Gao
Dongping Chen
Qihui Zhang
...
Tianyi Zhou
Xiangliang Zhang
Jianfeng Gao
Chaowei Xiao
Lichao Sun
SyDa
433
22
0
27 Jun 2024
Autonomous Prompt Engineering in Large Language Models
Daan Kepel
Konstantina Valogianni
LLMAG
219
13
0
25 Jun 2024
Harnessing AI for efficient analysis of complex policy documents: a case study of Executive Order 14110
Mark A. Kramer
Allen Leavens
Alexander Scarlat
46
1
0
10 Jun 2024
On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Ziyu Wang
Chris Holmes
UQLM
382
8
0
07 Jun 2024
Improving Uncertainty Estimation through Semantically Diverse Language Generation
International Conference on Learning Representations (ICLR), 2024
L. Aichberger
Kajetan Schweighofer
Mykyta Ielanskyi
Sepp Hochreiter
HILM
236
18
0
06 Jun 2024
Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller
Min Cai
Yuchen Zhang
Shichang Zhang
Fan Yin
Difan Zou
Yisong Yue
Ziniu Hu
270
3
0
04 Jun 2024
Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners
Zhi Zheng
Qian Feng
Hang Li
Alois C. Knoll
Jianxiang Feng
412
13
0
01 Jun 2024
Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities
Alexander Nikitin
Jannik Kossen
Yarin Gal
Pekka Marttinen
UQCV
350
85
0
30 May 2024
LLMs can learn self-restraint through iterative self-reflection
Alexandre Piché
Aristides Milios
Dzmitry Bahdanau
Chris Pal
246
6
0
15 May 2024
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning
Yuxi Xie
Anirudh Goyal
Wenyue Zheng
Min-Yen Kan
Timothy Lillicrap
Kenji Kawaguchi
Michael Shieh
ReLM
LRM
346
188
0
01 May 2024
Stealing Part of a Production Language Model
International Conference on Machine Learning (ICML), 2024
Nicholas Carlini
Daniel Paleka
Krishnamurthy Dvijotham
Thomas Steinke
Jonathan Hayase
...
Arthur Conmy
Itay Yona
Eric Wallace
David Rolnick
Florian Tramèr
MLAU
AAML
258
129
0
11 Mar 2024