Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs
Jannik Kossen, Jiatong Han, Muhammed Razzak, Lisa Schut, Shreshth A. Malik, Yarin Gal
22 June 2024 · arXiv 2406.15927 · HILM

Papers citing "Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs" (32 / 32 papers shown)

Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers
Dylan Bouchard, Mohit Singh Chauhan · HILM · 27 Apr 2025

Hallucination Detection in LLMs via Topological Divergence on Attention Graphs
Alexandra Bazarova, Aleksandr Yugay, Andrey Shulga, A. Ermilova, Andrei Volodichev, ..., Dmitry Simakov, M. Savchenko, Andrey Savchenko, Serguei Barannikov, Alexey Zaytsev · HILM · 14 Apr 2025

Enhancing Mathematical Reasoning in Large Language Models with Self-Consistency-Based Hallucination Detection
MingShan Liu, Shi Bo, Jialing Fang · LRM · 13 Apr 2025

Robust Hallucination Detection in LLMs via Adaptive Token Selection
Mengjia Niu, Hamed Haddadi, Guansong Pang · HILM · 10 Apr 2025

Hallucination Detection on a Budget: Efficient Bayesian Estimation of Semantic Entropy
K. Ciosek, Nicolò Felicioni, Sina Ghiassian · 04 Apr 2025

The Illusionist's Prompt: Exposing the Factual Vulnerabilities of Large Language Models with Linguistic Nuances
Yining Wang, Y. Wang, Xi Li, Mi Zhang, Geng Hong, Min Yang · AAML, HILM · 01 Apr 2025

FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs
Albert Sawczyn, Jakub Binkowski, Denis Janiak, Bogdan Gabrys, Tomasz Kajdanowicz · HILM, LRM · 21 Mar 2025

Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji, L. Yu, Yeskendir Koishekenov, Yejin Bang, Anthony Hartshorn, Alan Schelten, Cheng Zhang, Pascale Fung, Nicola Cancedda · 18 Mar 2025

How to Steer LLM Latents for Hallucination Detection?
Seongheon Park, Xuefeng Du, Min-Hsuan Yeh, Haobo Wang, Yixuan Li · LLMSV · 01 Mar 2025

Semantic Volume: Quantifying and Detecting both External and Internal Uncertainty in LLMs
Xiaomin Li, Zhou Yu, Ziji Zhang, Yingying Zhuang, S. Narayanan Sadagopan, Anurag Beniwal · HILM · 28 Feb 2025

Hallucination Detection in LLMs Using Spectral Features of Attention Maps
Jakub Binkowski, Denis Janiak, Albert Sawczyn, Bogdan Gabrys, Tomasz Kajdanowicz · 24 Feb 2025

What are Models Thinking about? Understanding Large Language Model Hallucinations "Psychology" through Model Inner State Analysis
Peiran Wang, Yang Liu, Yunfei Lu, Jue Hong, Ye Wu · HILM, LRM · 20 Feb 2025

Uncertainty-Aware Step-wise Verification with Generative Reward Models
Zihuiwen Ye, L. Melo, Younesse Kaddar, Phil Blunsom, S. Kamath S, Yarin Gal · LRM · 16 Feb 2025

Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Shahrad Mohammadzadeh, Juan David Guerra, Marco Bonizzato, Reihaneh Rabbany, Golnoosh Farnadi · HILM · 08 Jan 2025

HalluCana: Fixing LLM Hallucination with A Canary Lookahead
Tianyi Li, Erenay Dayanik, Shubhi Tyagi, Andrea Pierleoni · HILM · 10 Dec 2024

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Javier Ferrando, Oscar Obeso, Senthooran Rajamanoharan, Neel Nanda · 21 Nov 2024

LLM Hallucination Reasoning with Zero-shot Knowledge Test
Seongmin Lee, Hsiang Hsu, Chun-Fu Chen · LRM · 14 Nov 2024

Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy
Benedict Aaron Tjandra, Muhammed Razzak, Jannik Kossen, Kunal Handa, Yarin Gal · HILM · 22 Oct 2024

ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
ZhongXiang Sun, Xiaoxue Zang, Kai Zheng, Yang Song, Jun Xu, Xiao Zhang, Weijie Yu, Yang Song, Han Li · 15 Oct 2024

Efficiently Deploying LLMs with Controlled Risk
Michael J. Zellinger, Matt Thomson · 03 Oct 2024

Integrative Decoding: Improve Factuality via Implicit Self-consistency
Yi Cheng, Xiao Liang, Yeyun Gong, Wen Xiao, Song Wang, ..., Wenjie Li, Jian Jiao, Qi Chen, Peng Cheng, Wayne Xiong · HILM · 02 Oct 2024

A Survey on the Honesty of Large Language Models
Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, ..., Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam · HILM · 27 Sep 2024

Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models
Nishanth Madhusudhan, Sathwik Tejaswi Madhusudhan, Vikas Yadav, Masoud Hashemi · 23 Jul 2024

On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker · 08 Jul 2024

The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks, Max Tegmark · HILM · 10 Oct 2023

Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness
Jiuhai Chen, Jonas W. Mueller · 30 Aug 2023

Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
Xingxuan Li, Ruochen Zhao, Yew Ken Chia, Bosheng Ding, Shafiq R. Joty, Soujanya Poria, Lidong Bing · HILM, BDL, LRM · 22 May 2023

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
Potsawee Manakul, Adian Liusie, Mark J. F. Gales · HILM, LRM · 15 Mar 2023

Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization
Mengyao Cao, Yue Dong, Jackie C.K. Cheung · HILM · 30 Aug 2021

Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov · 24 Feb 2021

Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke, Arthur Szlam, Emily Dinan, Y-Lan Boureau · 30 Dec 2020

Language Models as Knowledge Bases?
Fabio Petroni, Tim Rocktäschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel · KELM, AI4MH · 03 Sep 2019