AlphaZero-Edu: Making AlphaZero Accessible to Everyone

Source: arXiv
Abstract

Recent years have witnessed significant progress in reinforcement learning, especially with Zero-like paradigms, which have greatly boosted the generalization and reasoning abilities of large-scale language models. Nevertheless, existing frameworks are often plagued by high implementation complexity and poor reproducibility. To tackle these challenges, we present AlphaZero-Edu, a lightweight, education-focused implementation built upon the mathematical framework of AlphaZero. It boasts a modular architecture that disentangles key components, enabling transparent visualization of the algorithmic processes. Additionally, it is optimized for resource-efficient training on a single NVIDIA RTX 3090 GPU and features highly parallelized self-play data generation, achieving a 3.2-fold speedup with 8 processes. In Gomoku matches, the framework has demonstrated exceptional performance, achieving a consistently high win rate against human opponents. AlphaZero-Edu has been open-sourced at https://github.com/StarLight1212/AlphaZero_Edu, providing an accessible and practical benchmark for both academic research and industrial applications.

Binjie Guo, Ru Zhang, Haohan Jiang, Xurong Lin, Hongyan Wei, Aisheng Mo, Jie Li, Zhiyuan Qian, Zhuhao Zhang, Xiaoyuan Cheng, Hanyu Zheng, Guowei Su

Computing technology, computer technology

Binjie Guo, Ru Zhang, Haohan Jiang, Xurong Lin, Hongyan Wei, Aisheng Mo, Jie Li, Zhiyuan Qian, Zhuhao Zhang, Xiaoyuan Cheng, Hanyu Zheng, Guowei Su. AlphaZero-Edu: Making AlphaZero Accessible to Everyone [EB/OL]. (2025-04-20) [2025-05-02]. https://arxiv.org/abs/2504.14636.
