
Localizing Factual Inconsistencies in Attributable Text Generation

Source: arXiv
Abstract

There has been increasing interest in detecting hallucinations in model-generated texts, both manually and automatically, at varying levels of granularity. However, most existing methods fail to precisely pinpoint the errors. In this work, we introduce QASemConsistency, a new formalism for fine-grained localization of factual inconsistencies in attributable text generation. Drawing inspiration from Neo-Davidsonian formal semantics, we propose decomposing the generated text into minimal predicate-argument level propositions, expressed as simple question-answer (QA) pairs, and assessing whether each individual QA pair is supported by a trusted reference text. As each QA pair corresponds to a single semantic relation between a predicate and an argument, QASemConsistency effectively localizes the unsupported information. We first demonstrate the effectiveness of the QASemConsistency methodology for human annotation by collecting crowdsourced annotations of granular consistency errors, while achieving substantial inter-annotator agreement. The resulting benchmark includes more than 3K instances spanning various tasks of attributable text generation. We also show that QASemConsistency yields factual consistency scores that correlate well with human judgments. Finally, we implement several methods for automatically detecting localized factual inconsistencies, using both supervised entailment models and LLMs.

Arie Cattan, Paul Roit, Shiyue Zhang, David Wan, Roee Aharoni, Idan Szpektor, Mohit Bansal, Ido Dagan

Computing Technology, Computer Technology

Arie Cattan, Paul Roit, Shiyue Zhang, David Wan, Roee Aharoni, Idan Szpektor, Mohit Bansal, Ido Dagan. Localizing Factual Inconsistencies in Attributable Text Generation [EB/OL]. (2025-08-24) [2025-09-05]. https://arxiv.org/abs/2410.07473.
