Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.08896
Cited By
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
15 March 2023
Potsawee Manakul
Adian Liusie
Mark J. F. Gales
HILM
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models"
12 / 12 papers shown
Title
Consistency in Language Models: Current Landscape, Challenges, and Future Directions
Jekaterina Novikova
Carol Anderson
Borhane Blili-Hamelin
Subhabrata Majumdar
HILM
27
0
0
01 May 2025
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers
Dylan Bouchard
Mohit Singh Chauhan
HILM
37
46
0
27 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
50
0
0
25 Apr 2025
Gauging Overprecision in LLMs: An Empirical Study
Adil Bahaj
Hamed Rahimi
Mohamed Chetouani
Mounir Ghogho
27
0
0
16 Apr 2025
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
Yifei Ming
Senthil Purushwalkam
Shrey Pandit
Zixuan Ke
Xuan-Phi Nguyen
Caiming Xiong
Shafiq R. Joty
HILM
70
15
0
30 Sep 2024
Ragas: Automated Evaluation of Retrieval Augmented Generation
ES Shahul
Jithin James
Luis Espinosa-Anke
Steven Schockaert
31
169
0
26 Sep 2023
The Internal State of an LLM Knows When It's Lying
A. Azaria
Tom Michael Mitchell
HILM
176
192
0
26 Apr 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
247
2,029
0
21 Mar 2022
The Factual Inconsistency Problem in Abstractive Text Summarization: A Survey
Yi-Chong Huang
Xiachong Feng
Xiaocheng Feng
Bing Qin
HILM
88
90
0
30 Apr 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
186
109
0
18 Apr 2021
Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong
Jingjing Xu
Duyu Tang
Zenan Xu
Nan Duan
M. Zhou
Jiahai Wang
Jian Yin
HILM
GNN
151
145
0
09 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
267
6,003
0
20 Apr 2018
1