Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs

15 April 2024

Jonathan Herzig

Yonatan Belinkov

Papers citing "Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs"

Title | Authors | Tags | Counts | Date
Integrative Decoding: Improve Factuality via Implicit Self-consistency | Yi Cheng, Xiao Liang, Yeyun Gong, Wen Xiao, Song Wang, ..., Wenjie Li, Jian Jiao, Qi Chen, Peng Cheng, Wayne Xiong | HILM | 50 1 0 | 02 Oct 2024
Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models | Hongbang Yuan, Pengfei Cao, Zhuoran Jin, Yubo Chen, Daojian Zeng, Kang Liu, Jun Zhao | HILM | 24 3 0 | 29 Feb 2024
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension | Fan Yin, Jayanth Srinivasa, Kai-Wei Chang | HILM | 52 19 0 | 28 Feb 2024
Towards Understanding Sycophancy in Language Models | Mrinank Sharma, Meg Tong, Tomasz Korbak, D. Duvenaud, Amanda Askell, ..., Oliver Rausch, Nicholas Schiefer, Da Yan, Miranda Zhang, Ethan Perez | | 209 178 0 | 20 Oct 2023
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets | Samuel Marks, Max Tegmark | HILM | 91 164 0 | 10 Oct 2023
How Language Model Hallucinations Can Snowball | Muru Zhang, Ofir Press, William Merrill, Alisa Liu, Noah A. Smith | HILM, LRM | 78 246 0 | 22 May 2023
The Internal State of an LLM Knows When It's Lying | A. Azaria, Tom Michael Mitchell | HILM | 213 297 0 | 26 Apr 2023
SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models | Potsawee Manakul, Adian Liusie, Mark J. F. Gales | HILM, LRM | 150 386 0 | 15 Mar 2023
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM, ALM | 301 11,730 0 | 04 Mar 2022
Similarity Analysis of Contextual Word Representation Models | John M. Wu, Yonatan Belinkov, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, James R. Glass | | 46 73 0 | 03 May 2020