Still No Lie Detector for Language Models: Probing Empirical and Conceptual Roadblocks
B. Levinstein and Daniel A. Herrmann
30 June 2023 · arXiv:2307.00175
Papers citing "Still No Lie Detector for Language Models: Probing Empirical and Conceptual Roadblocks" (5 of 5 shown):

- Prompt-Guided Internal States for Hallucination Detection of Large Language Models
  Fujie Zhang, Peiqi Yu, Biao Yi, Baolei Zhang, Tong Li, Zheli Liu
  HILM, LRM · 07 Nov 2024
- Does ChatGPT Have a Mind?
  Simon Goldstein, B. Levinstein
  AI4MH, LRM · 27 Jun 2024
- Standards for Belief Representations in LLMs
  Daniel A. Herrmann, B. Levinstein
  31 May 2024
- The Internal State of an LLM Knows When It's Lying
  A. Azaria, Tom Michael Mitchell
  HILM · 26 Apr 2023
- Truthful AI: Developing and governing AI that does not lie
  Owain Evans, Owen Cotton-Barratt, Lukas Finnveden, Adam Bales, Avital Balwit, Peter Wills, Luca Righetti, William Saunders
  HILM · 13 Oct 2021