StepFun-Prover Preview: Let's Think and Verify Step by Step
We present StepFun-Prover Preview, a large language model designed for formal theorem proving through tool-integrated reasoning. Using a reinforcement learning pipeline that incorporates tool-based interactions, StepFun-Prover achieves strong performance in generating Lean 4 proofs with minimal sampling. Our approach enables the model to emulate human-like problem-solving strategies by iteratively refining proofs based on real-time environment feedback. On the miniF2F-test benchmark, StepFun-Prover achieves a pass@1 success rate of $70.0\%$. Beyond advancing benchmark performance, we introduce an end-to-end training framework for developing tool-integrated reasoning models, offering a promising direction for automated theorem proving and Math AI assistants.
Shijie Shang, Ruosi Wan, Yue Peng, Yutong Wu, Xiong-hui Chen, Jie Yan, Xiangyu Zhang
Computing technology, computer technology
Shijie Shang, Ruosi Wan, Yue Peng, Yutong Wu, Xiong-hui Chen, Jie Yan, Xiangyu Zhang. StepFun-Prover Preview: Let's Think and Verify Step by Step [EB/OL]. (2025-07-27) [2025-08-10]. https://arxiv.org/abs/2507.20199.