
Text Generation Beyond Discrete Token Sampling

Source: arXiv
English Abstract

In standard autoregressive generation, an LLM predicts the next-token distribution, samples a discrete token, and then discards the distribution, passing only the sampled token as new input. To preserve this distribution's rich information, we propose Mixture of Inputs (MoI), a training-free method for autoregressive generation. After generating a token following the standard paradigm, we construct a new input that blends the generated discrete token with the previously discarded token distribution. Specifically, we employ a Bayesian estimation method that treats the token distribution as the prior, the sampled token as the observation, and replaces the conventional one-hot vector with the continuous posterior expectation as the new model input. MoI allows the model to maintain a richer internal representation throughout the generation process, resulting in improved text quality and reasoning capabilities. On mathematical reasoning, code generation, and PhD-level QA tasks, MoI consistently improves performance across multiple models including QwQ-32B, Nemotron-Super-49B, Gemma-3-27B, and DAPO-Qwen-32B, with no additional training and negligible computational overhead.
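The abstract only sketches the mechanism at a high level. The snippet below is a minimal, illustrative reading of it: the predictive distribution is treated as a prior and the sampled one-hot token as an observation, the two are combined into a posterior-expectation-style weight vector, and that vector replaces the one-hot embedding lookup with a weighted sum of input embeddings. The function name mixture_of_inputs_embedding and the mixing hyperparameter beta are assumptions for illustration, not the paper's exact estimator.

import torch

def mixture_of_inputs_embedding(
    token_probs: torch.Tensor,       # next-token distribution p_t, shape (vocab_size,)
    sampled_token_id: int,           # token id sampled from p_t
    embedding_matrix: torch.Tensor,  # input embedding table, shape (vocab_size, hidden_dim)
    beta: float = 1.0,               # assumed mixing weight on the prior distribution
) -> torch.Tensor:
    """Blend the sampled one-hot token with its predictive distribution and
    return a continuous input embedding (illustrative sketch)."""
    vocab_size = token_probs.shape[0]

    # One-hot "observation" corresponding to the sampled token.
    one_hot = torch.zeros(vocab_size, dtype=token_probs.dtype, device=token_probs.device)
    one_hot[sampled_token_id] = 1.0

    # Posterior-expectation-style mixture: the predictive distribution acts as
    # the prior, the sampled token as a single observation; beta controls how
    # much weight the prior keeps (hyperparameter assumed here).
    mixed_weights = (beta * token_probs + one_hot) / (beta + 1.0)

    # Replace the usual one-hot embedding lookup with a weighted sum of the
    # embedding rows under the mixed weights.
    return mixed_weights @ embedding_matrix

# Toy usage: 5-token vocabulary, 8-dimensional embeddings.
probs = torch.softmax(torch.randn(5), dim=-1)
emb = torch.randn(5, 8)
token_id = int(torch.multinomial(probs, num_samples=1))
next_input = mixture_of_inputs_embedding(probs, token_id, emb)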

Yufan Zhuang, Liyuan Liu, Chandan Singh, Jingbo Shang, Jianfeng Gao

Subjects: Computing Technology, Computer Technology

Yufan Zhuang, Liyuan Liu, Chandan Singh, Jingbo Shang, Jianfeng Gao. Text Generation Beyond Discrete Token Sampling [EB/OL]. (2025-05-20) [2025-06-19]. https://arxiv.org/abs/2505.14827.
