First Hallucination Tokens Are Different from Conditional Ones

28 July 2025

Jakob Snel

Seong Joon Oh

HILM

Main: 4 Pages

39 Figures

Bibliography: 3 Pages

1 Table

Appendix: 37 Pages

Abstract

Hallucination, the generation of untruthful content, is one of the major concerns regarding foundational models. Detecting hallucinations at the token level is vital for real-time filtering and targeted correction, yet the variation of hallucination signals within token sequences is not fully understood. Leveraging the RAGTruth corpus with token-level annotations and reproduced logits, we analyse how these signals depend on a token's position within hallucinated spans, contributing to an improved understanding of token-level hallucination. Our results show that the first hallucinated token carries a stronger signal and is more detectable than conditional tokens. We release our analysis framework, along with code for logit reproduction and metric computation at this https URL\_Xtended.
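
To make the distinction between first and conditional hallucinated tokens concrete, here is a minimal Python sketch. It is not the released code. It assumes binary token-level labels (1 = hallucinated) and treats a token as "first" when it opens a hallucinated span and as "conditional" when it follows another hallucinated token. The function names and the use of softmax entropy as an example logit-based signal are illustrative assumptions; the abstract does not say which metrics the paper computes.

```python
import math
from typing import List, Tuple


def split_span_positions(labels: List[int]) -> Tuple[List[int], List[int]]:
    """Split hallucinated token indices into (first, conditional).

    Assumption: labels[i] == 1 marks a hallucinated token. A token is
    'first' if the previous token is not hallucinated (or it is at
    position 0), and 'conditional' otherwise.
    """
    first: List[int] = []
    conditional: List[int] = []
    prev = 0
    for i, lab in enumerate(labels):
        if lab == 1:
            if prev == 0:
                first.append(i)
            else:
                conditional.append(i)
        prev = lab
    return first, conditional


def entropy_from_logits(logits: List[float]) -> float:
    """Shannon entropy (in nats) of softmax(logits).

    Used here only as an example of a logit-derived uncertainty signal.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical label sequence with two hallucinated spans.
labels = [0, 0, 1, 1, 1, 0, 1, 1]
first_idx, conditional_idx = split_span_positions(labels)
```

Given per-token logits, one could then compare the average signal over `first_idx` with the average over `conditional_idx` to see whether the two groups differ, which is the kind of position-dependent comparison the abstract describes.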
