Dynamic Early Exit in Reasoning Models
Recent advances in large reasoning language models (LRLMs) rely on test-time scaling, which extends long chain-of-thought (CoT) generation to solve complex tasks. However, overthinking in long CoT not only reduces problem-solving efficiency but also risks accuracy loss from overly detailed or redundant reasoning steps. We propose a simple yet effective method that allows LLMs to self-truncate CoT sequences by exiting early during generation. Instead of relying on fixed heuristics, the proposed method monitors model behavior at potential reasoning transition points (e.g., "Wait" tokens) and dynamically stops before generating the next reasoning chain when the model exhibits high confidence in a trial answer. Our method requires no additional training and can be seamlessly integrated into existing o1-like reasoning LLMs. Experiments on 10 reasoning benchmarks (e.g., GSM8K, MATH-500, AMC, GPQA, AIME and LiveCodeBench) show that the proposed method is consistently effective on 11 cutting-edge reasoning LLMs of varying series and sizes, reducing the length of CoT sequences by an average of 19.1% to 80.1% while improving accuracy by 0.3% to 5.0%.
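The control flow described in the abstract can be illustrated with a small, model-free sketch. It is not the authors' implementation: the function names (`answer_confidence`, `reason_with_early_exit`, `toy_probe`), the geometric-mean confidence metric, and the 0.95 threshold are all assumptions made for illustration, and the pre-written reasoning chunks stand in for text a real model would generate up to each "Wait" transition point.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical probe type: given the reasoning kept so far, return a trial
# answer and the per-token probabilities the model assigned to that answer.
Probe = Callable[[List[str]], Tuple[str, Sequence[float]]]


def answer_confidence(token_probs: Sequence[float]) -> float:
    """Geometric mean of the trial answer's token probabilities.

    This metric is an assumption; the abstract only says "high confidence".
    """
    if not token_probs:
        return 0.0
    log_sum = sum(math.log(max(p, 1e-12)) for p in token_probs)
    return math.exp(log_sum / len(token_probs))


def reason_with_early_exit(
    chunks: Sequence[str], probe: Probe, threshold: float = 0.95
) -> Tuple[List[str], str, bool]:
    """Walk through reasoning chunks, probing a trial answer at each boundary.

    Each chunk ends where the next one would begin with a transition token
    such as "Wait". Returns (kept_chunks, answer, exited_early).
    """
    kept: List[str] = []
    answer = ""
    for index, chunk in enumerate(chunks):
        kept.append(chunk)
        answer, probs = probe(kept)
        if answer_confidence(probs) >= threshold:
            # Confident enough: skip every remaining reasoning chain.
            return kept, answer, index + 1 < len(chunks)
    return kept, answer, False


def toy_probe(kept: List[str]) -> Tuple[str, Sequence[float]]:
    """Stand-in for a model call: confidence rises after one re-check."""
    if len(kept) == 1:
        return "4", [0.6, 0.7]
    return "4", [0.99, 0.98]


chunks = [
    "2 + 2 is 4.",
    "Wait, let me double-check: 2 + 2 = 4.",
    "Wait, maybe try another approach to be sure...",
]
kept, answer, exited_early = reason_with_early_exit(chunks, toy_probe)
print(len(kept), answer, exited_early)
```

In this toy run the first probe is below the threshold, so reasoning continues; the second probe clears it, and the third chunk is never consumed. With a real model, the probe would correspond to prompting for a trial answer at the transition point and reading the token probabilities of that answer.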
Weiping Wang, Chenxu Yang, Qingyi Si, Yongjie Duan, Zheliang Zhu, Chenyu Zhu, Qiaowei Li, Zheng Lin, Li Cao
Computing technology, computer technology
Weiping Wang, Chenxu Yang, Qingyi Si, Yongjie Duan, Zheliang Zhu, Chenyu Zhu, Qiaowei Li, Zheng Lin, Li Cao. Dynamic Early Exit in Reasoning Models [EB/OL]. (2025-04-22) [2025-06-03]. https://arxiv.org/abs/2504.15895.