Theoretical Foundations and Mitigation of Hallucination in Large Language Models

Source: arXiv
Abstract

Hallucination in Large Language Models (LLMs) refers to the generation of content that is not faithful to the input or to real-world facts. This paper provides a rigorous treatment of hallucination in LLMs, including formal definitions and theoretical analyses. We distinguish between intrinsic and extrinsic hallucinations, and define a "hallucination risk" for models. We derive bounds on this risk using learning-theoretic frameworks (PAC-Bayes and Rademacher complexity). We then survey detection strategies for hallucinations, such as token-level uncertainty estimation, confidence calibration, and attention alignment checks. On the mitigation side, we discuss approaches including retrieval-augmented generation, hallucination-aware fine-tuning, logit calibration, and the incorporation of fact-verification modules. We propose a unified detection and mitigation workflow, illustrated with a diagram, to integrate these strategies. Finally, we outline evaluation protocols for hallucination, recommending datasets, metrics, and experimental setups to quantify and reduce hallucinations. Our work lays a theoretical foundation and practical guidelines for addressing the crucial challenge of hallucination in LLMs.
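To make the abstract's theoretical claims more concrete, here is a sketch of how a hallucination risk and a PAC-Bayes bound on it might be formalized; the notation (input distribution $\mathcal{D}$, knowledge source $\mathcal{K}$, prior $P$, posterior $Q$, empirical rate $\hat{R}_n$) is assumed for this sketch and is not taken from the paper. For a model $f$, one can define

\[
R(f) \;=\; \mathbb{E}_{x \sim \mathcal{D}}\Big[\mathbf{1}\big\{\,f(x)\ \text{is unfaithful to}\ x\ \text{or to}\ \mathcal{K}\,\big\}\Big],
\]

and a standard McAllester-style PAC-Bayes argument for a loss bounded in $[0,1]$ then gives, with probability at least $1-\delta$ over an i.i.d. sample of size $n$,

\[
\mathbb{E}_{f \sim Q}\big[R(f)\big] \;\le\; \mathbb{E}_{f \sim Q}\big[\hat{R}_n(f)\big] \;+\; \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
\]

where $\hat{R}_n(f)$ denotes the empirical hallucination rate on the sample.

Similarly, as a minimal sketch of the token-level uncertainty estimation mentioned among the detection strategies (not the paper's own procedure), the snippet below scores each token of a passage by the entropy of a causal language model's predictive distribution and flags the passage when the mean entropy exceeds a threshold; the checkpoint name "gpt2" and the threshold value are illustrative assumptions.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def token_entropy_scores(model, tokenizer, text):
        # Score each token by the entropy of the model's predictive distribution
        # at its position; high entropy is treated as a weak hallucination signal.
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits                      # (1, seq_len, vocab)
        probs = torch.softmax(logits[0, :-1], dim=-1)       # positions predicting tokens 1..n-1
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
        tokens = tokenizer.convert_ids_to_tokens(ids[0, 1:].tolist())
        return list(zip(tokens, entropy.tolist()))

    def flag_possible_hallucination(scores, threshold=3.0):
        # The threshold is a tunable assumption, not a value from the paper.
        mean_entropy = sum(s for _, s in scores) / max(len(scores), 1)
        return mean_entropy > threshold

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
    scores = token_entropy_scores(model, tokenizer, "The Eiffel Tower is located in Berlin.")
    print(flag_possible_hallucination(scores))

In practice, such a per-token score would be combined with calibration or fact-verification steps rather than used alone, in line with the unified detection and mitigation workflow the abstract describes.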

Esmail Gumaan

Subject: Computing Technology; Computer Technology

Esmail Gumaan. Theoretical Foundations and Mitigation of Hallucination in Large Language Models [EB/OL]. (2025-07-20) [2025-08-07]. https://arxiv.org/abs/2507.22915.
