SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

15 March 2023

Potsawee Manakul

Mark J. F. Gales

Papers citing "SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models"

Title | Authors | Tags | Counts | Date
Consistency in Language Models: Current Landscape, Challenges, and Future Directions | Jekaterina Novikova, Carol Anderson, Borhane Blili-Hamelin, Subhabrata Majumdar | HILM | 27 0 0 | 01 May 2025
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers | Dylan Bouchard, Mohit Singh Chauhan | HILM | 37 46 0 | 27 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review | Toghrul Abbasli, Kentaroh Toyoda, Yuan Wang, Leon Witt, Muhammad Asif Ali, Yukai Miao, Dan Li, Qingsong Wei | UQCV | 50 0 0 | 25 Apr 2025
Gauging Overprecision in LLMs: An Empirical Study | Adil Bahaj, Hamed Rahimi, Mohamed Chetouani, Mounir Ghogho | - | 27 0 0 | 16 Apr 2025
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows" | Yifei Ming, Senthil Purushwalkam, Shrey Pandit, Zixuan Ke, Xuan-Phi Nguyen, Caiming Xiong, Shafiq R. Joty | HILM | 70 15 0 | 30 Sep 2024
Ragas: Automated Evaluation of Retrieval Augmented Generation | ES Shahul, Jithin James, Luis Espinosa-Anke, Steven Schockaert | - | 31 169 0 | 26 Sep 2023
The Internal State of an LLM Knows When It's Lying | A. Azaria, Tom Michael Mitchell | HILM | 176 192 0 | 26 Apr 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models | Xuezhi Wang, Jason W. Wei, Dale Schuurmans, Quoc Le, Ed H. Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou | ReLM, BDL, LRM, AI4CE | 247 2,029 0 | 21 Mar 2022
The Factual Inconsistency Problem in Abstractive Text Summarization: A Survey | Yi-Chong Huang, Xiachong Feng, Xiaocheng Feng, Bing Qin | HILM | 88 90 0 | 30 Apr 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation | Tianyu Liu, Yizhe Zhang, Chris Brockett, Yi Mao, Zhifang Sui, Weizhu Chen, W. Dolan | HILM | 186 109 0 | 18 Apr 2021
Reasoning Over Semantic-Level Graph for Fact Checking | Wanjun Zhong, Jingjing Xu, Duyu Tang, Zenan Xu, Nan Duan, M. Zhou, Jiahai Wang, Jian Yin | HILM, GNN | 151 145 0 | 09 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman | ELM | 267 6,003 0 | 20 Apr 2018