National Preprint Platform

Unraveling the Rainbow: can value-based methods schedule?


Source: arXiv
English abstract

Recently, deep reinforcement learning has emerged as a promising approach for solving complex combinatorial optimization problems. Broadly, deep reinforcement learning methods fall into two categories: policy-based and value-based. While value-based approaches have achieved notable success in domains such as the Arcade Learning Environment, the combinatorial optimization community has predominantly favored policy-based methods, often overlooking the potential of value-based algorithms. In this work, we conduct a comprehensive empirical evaluation of value-based algorithms, including the deep Q-network and several of its advanced extensions, on two complex combinatorial problems: the job-shop and flexible job-shop scheduling problems, two fundamental challenges with multiple industrial applications. Our results challenge the assumption that policy-based methods are inherently superior for combinatorial optimization. We show that several value-based approaches can match or even outperform the widely adopted proximal policy optimization algorithm, suggesting that value-based strategies deserve greater attention from the combinatorial optimization community. Our code is openly available at: https://github.com/AJ-Correa/Unraveling-the-Rainbow.
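To illustrate the value-based idea the abstract refers to, here is a minimal sketch of a tabular Q-learning update for a dispatching decision (which job to schedule next on a machine). This is not the paper's implementation, which uses a deep Q-network and its extensions; the function and state/action names are purely illustrative assumptions.

```python
def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    q            -- dict mapping (state, action) pairs to Q-values
    next_actions -- actions available in next_state (e.g. dispatchable jobs)
    """
    best_next = max(q.get((next_state, a), 0.0) for a in next_actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Toy usage: on machine state "m1_idle", dispatching "job_2" incurs a
# negative reward (e.g. elapsed processing time toward the makespan).
q_table = {}
q_update(q_table, "m1_idle", "job_2", -5.0, "m1_busy", ["job_1", "job_3"])
```

A deep Q-network replaces the table `q` with a neural network that maps a scheduling-state encoding to Q-values for all dispatchable actions, and the same update becomes a regression loss on the bootstrapped target.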

Arthur Corrêa, Alexandre Jesus, Cristóvão Silva, Samuel Moniz

Subject areas: fundamental theory of automation; automation technology and equipment; computing and computer technology

Arthur Corrêa, Alexandre Jesus, Cristóvão Silva, Samuel Moniz. Unraveling the Rainbow: can value-based methods schedule? [EB/OL]. (2025-05-06) [2025-05-26]. https://arxiv.org/abs/2505.03323.
