Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
LLMs often adopt an assertive language style even when making false claims. Such "overconfident hallucinations" mislead users and erode trust. Enabling a model to express in language the actual degree of uncertainty around a claim is therefore of great importance. We find that "verbal uncertainty" is governed by a single linear feature in the representation space of LLMs, and show that it correlates only moderately with the model's actual "semantic uncertainty". Applying this insight, we show that (1) the mismatch between semantic and verbal uncertainty is a better predictor of hallucinations than semantic uncertainty alone, and (2) intervening on verbal uncertainty at inference time reduces confident hallucinations on short-form answers, achieving an average relative reduction of ~30%.
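The inference-time intervention described in the abstract amounts to shifting a layer's hidden states along one linear direction. Below is a minimal sketch of such a steering intervention, assuming a PyTorch/Hugging Face setup; the model (`gpt2`), layer index, scale, and the random placeholder direction are illustrative assumptions, not the paper's actual probe-derived feature.

```python
# Minimal sketch: steer generation along a single linear feature direction
# at inference time. The direction here is random for illustration; in the
# paper it would be learned from the model's representations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper studies larger LLMs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

hidden = model.config.hidden_size
direction = torch.randn(hidden)
direction = direction / direction.norm()  # unit-norm placeholder direction

LAYER, SCALE = 6, 4.0  # hypothetical layer and steering strength

def steer(module, inputs, output):
    # GPT-2 blocks return a tuple; the first element is the hidden states.
    hidden_states = output[0] + SCALE * direction.to(output[0].dtype)
    return (hidden_states,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steer)
try:
    ids = tok("The capital of Australia is", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=20, do_sample=False)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()  # restore the un-steered model
```

With a genuine verbal-uncertainty direction in place of the random vector, increasing the scale would push generations toward more hedged phrasing, while a negative scale would push toward assertive phrasing.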
Ziwei Ji, Lei Yu, Yeskendir Koishekenov, Yejin Bang, Anthony Hartshorn, Alan Schelten, Cheng Zhang, Pascale Fung, Nicola Cancedda
Computing Technology, Computer Technology
Ziwei Ji, Lei Yu, Yeskendir Koishekenov, Yejin Bang, Anthony Hartshorn, Alan Schelten, Cheng Zhang, Pascale Fung, Nicola Cancedda. Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations [EB/OL]. (2025-03-18) [2025-05-15]. https://arxiv.org/abs/2503.14477