First Hallucination Tokens Are Different from Conditional Ones

Source: arXiv
Abstract

Hallucination, the generation of untruthful content, is one of the major concerns regarding foundational models. Detecting hallucinations at the token level is vital for real-time filtering and targeted correction, yet the variation of hallucination signals within token sequences is not fully understood. Leveraging the RAGTruth corpus with token-level annotations and reproduced logits, we analyse how these signals depend on a token's position within hallucinated spans, contributing to an improved understanding of token-level hallucination. Our results show that the first hallucinated token carries a stronger signal and is more detectable than conditional tokens. We release our analysis framework, along with code for logit reproduction and metric computation at https://github.com/jakobsnl/RAGTruth_Xtended.
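The comparison described in the abstract, first hallucinated token versus the remaining tokens of the same span, can be illustrated with a minimal sketch. It is not the authors' released code: the function names are hypothetical, "conditional" is assumed to mean every labelled token after the first one in a contiguous hallucinated span, and softmax entropy is used only as one example of a logit-based signal.

```python
import math


def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution for one token."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)


def split_by_span_position(labels):
    """Split hallucinated token indices into span-initial and later positions.

    labels: list of 0/1 flags, 1 = token annotated as hallucinated.
    Assumption: a span is a maximal run of consecutive 1s.
    """
    first, conditional = [], []
    for i, lab in enumerate(labels):
        if lab != 1:
            continue
        if i == 0 or labels[i - 1] == 0:
            first.append(i)        # starts a new hallucinated span
        else:
            conditional.append(i)  # continues an existing span
    return first, conditional


def mean_signal_by_position(logits_per_token, labels):
    """Mean entropy of first vs. conditional hallucinated tokens."""
    entropies = [token_entropy(l) for l in logits_per_token]
    first, conditional = split_by_span_position(labels)

    def mean(indices):
        if not indices:
            return float("nan")
        return sum(entropies[i] for i in indices) / len(indices)

    return mean(first), mean(conditional)
```

Calling `mean_signal_by_position` on one response's per-token logits and its 0/1 annotation returns two averages that can then be compared across a corpus. Any other per-token metric could replace `token_entropy` without changing the grouping logic.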

Jakob Snel, Seong Joon Oh

Computing technology, computer technology

Jakob Snel, Seong Joon Oh. First Hallucination Tokens Are Different from Conditional Ones [EB/OL]. (2025-07-28) [2025-08-18]. https://arxiv.org/abs/2507.20836.
