CodeReasoner: Enhancing the Code Reasoning Ability with Reinforcement Learning
Code reasoning is a fundamental capability for large language models (LLMs) in the code domain. It involves understanding and predicting a program's execution behavior, such as determining the output for a given input or whether a specific statement will be executed. This capability is essential for downstream tasks like debugging, code generation, and program repair. Prior approaches mainly rely on supervised fine-tuning to improve performance on code reasoning tasks. However, they often show limited gains and fail to generalize across diverse scenarios. We argue this is due to two core issues: the low quality of training data and the limitations of supervised fine-tuning, which struggles to teach general reasoning skills. To address these challenges, we propose CodeReasoner, a framework that spans both dataset construction and a two-stage training process. First, we introduce a method to construct datasets that focus on the core execution logic of Python programs. Next, we apply instruction tuning to inject execution-specific knowledge distilled from a powerful teacher model. We then enhance reasoning and generalization through GRPO (Group Relative Policy Optimization) reinforcement learning on top of the fine-tuned model. Experiments on three widely used code reasoning benchmarks show that CodeReasoner improves performance by 27.1% to 40.2% over prior methods with a 7B model. Notably, the 7B model matches GPT-4o on key tasks such as input/output and coverage prediction. When scaled to 14B, CodeReasoner outperforms GPT-4o across all benchmarks. Ablation studies confirm the effectiveness of each training stage and highlight the importance of reasoning chains.
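To make the task concrete, here is a minimal, hypothetical illustration of the code reasoning tasks the abstract describes (output prediction and coverage prediction); the function and values are invented for illustration and are not drawn from the paper's dataset:

```python
# Hypothetical output-prediction task: given the program and the call,
# the model must predict the printed result without running the code.
def f(xs):
    total = 0
    for x in xs:
        if x % 2 == 0:   # coverage prediction asks: does the next line execute?
            total += x
    return total

print(f([1, 2, 3, 4]))   # a correct model predicts the output: 6
```

For reference, the standard GRPO formulation (from which the paper's second training stage takes its name) replaces a learned value model with group-relative advantages; the paper's exact objective may differ in details such as KL regularization. Sampling $G$ responses per prompt with rewards $r_1, \dots, r_G$, the advantage of the $i$-th response is

\[
\hat{A}_i = \frac{r_i - \operatorname{mean}(\{r_1,\dots,r_G\})}{\operatorname{std}(\{r_1,\dots,r_G\})},
\]

which is then plugged into a PPO-style clipped policy objective.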
Lingxiao Tang, He Ye, Zhongxin Liu, Xiaoxue Ren, Lingfeng Bao
Computing technology, computer technology
Lingxiao Tang, He Ye, Zhongxin Liu, Xiaoxue Ren, Lingfeng Bao. CodeReasoner: Enhancing the Code Reasoning Ability with Reinforcement Learning [EB/OL]. (2025-07-23) [2025-08-23]. https://arxiv.org/abs/2507.17548