
Corrector Sampling in Language Models

Source: arXiv
Abstract

Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by iteratively revisiting and potentially replacing tokens in a window of previously generated text. This method can be integrated into existing autoregressive models, preserving their next-token-prediction quality and speed. Fine-tuning a pretrained 8B-parameter model with RPT for only 100B tokens resulted in ~10% relative improvements on reasoning and coding benchmarks compared to standard sampling.
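To make the revisit-and-replace idea concrete, below is a minimal Python sketch of the sampling loop the abstract describes. It is not the authors' implementation: the interface next_token_dist, the constants WINDOW and NUM_REVISITS, and the choice to resample a revisited token from its left-context distribution only are all assumptions made for illustration; the paper's actual resampling distribution and schedule may differ.

import random

WINDOW = 8        # how many of the most recent tokens may be revisited (assumed)
NUM_REVISITS = 2  # how many correction passes per newly generated token (assumed)

def sample(dist):
    """Draw a token id from a {token_id: probability} dict."""
    tokens, probs = zip(*dist.items())
    return random.choices(tokens, weights=probs, k=1)[0]

def generate_with_rpt(next_token_dist, prompt, max_new_tokens, eos_id):
    """Autoregressive generation that periodically resamples recent tokens.

    next_token_dist(context) -> {token_id: prob} is an assumed interface to
    the underlying autoregressive model.
    """
    seq = list(prompt)
    for _ in range(max_new_tokens):
        # 1. Standard next-token prediction step.
        seq.append(sample(next_token_dist(seq)))
        if seq[-1] == eos_id:
            break
        # 2. Corrector step: pick positions inside the trailing window of
        #    previously generated text and resample them. Here each revisited
        #    token is redrawn given only the tokens to its left, a
        #    simplification of the resampling distribution used in the paper.
        start = max(len(prompt), len(seq) - WINDOW)
        for _ in range(NUM_REVISITS):
            pos = random.randrange(start, len(seq))
            seq[pos] = sample(next_token_dist(seq[:pos]))
    return seq

Because the corrector step only touches a bounded trailing window, the per-token cost stays within a constant factor of ordinary next-token sampling, which is consistent with the abstract's claim that the method preserves generation speed.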

Itai Gat, Neta Shaul, Uriel Singer, Yaron Lipman

Subject: Computing technology; computer technology

Itai Gat, Neta Shaul, Uriel Singer, Yaron Lipman. Corrector Sampling in Language Models [EB/OL]. (2025-06-06) [2025-07-02]. https://arxiv.org/abs/2506.06215.
