A finite time analysis of distributed Q-learning
Multi-agent reinforcement learning (MARL) has witnessed a remarkable surge in interest, fueled by the empirical success of single-agent reinforcement learning (RL). In this study, we consider a distributed Q-learning scenario in which a number of agents cooperatively solve a sequential decision-making problem without access to the central reward function, which is the average of the local rewards. In particular, we provide a finite-time analysis of a distributed Q-learning algorithm, yielding a new sample complexity result of $\tilde{\mathcal{O}}\left( \min\left\{ \frac{1}{\epsilon^2} \frac{t_{\text{mix}}}{(1-\gamma)^6 d_{\min}^4}, \; \frac{1}{\epsilon} \frac{\sqrt{|\mathcal{S}||\mathcal{A}|}}{(1-\rho_2(\boldsymbol{W}))(1-\gamma)^4 d_{\min}^3} \right\} \right)$ under a tabular lookup table setting.
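The abstract points to the consensus-based family of distributed Q-learning algorithms: each agent runs a local TD update using only its own reward, then averages its Q-table with neighbors through a mixing matrix $\boldsymbol{W}$, whose second-largest eigenvalue modulus $\rho_2(\boldsymbol{W})$ appears in the bound above. The following is a minimal sketch of that general template, not the paper's exact update rule; the environment (transition kernel `P`, local rewards `R`, fully connected mixing matrix `W`, uniform exploration policy) is a hypothetical placeholder.

```python
import numpy as np

# Sketch of consensus-based distributed Q-learning (assumed template).
# N agents, |S| states, |A| actions; agent i sees only its local reward
# r_i, while the team objective is the average reward (1/N) * sum_i r_i.
rng = np.random.default_rng(0)
N, S, A, gamma, alpha = 4, 5, 3, 0.9, 0.1

P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] = distribution over next states
R = rng.uniform(0.0, 1.0, size=(N, S, A))   # local rewards r_i(s, a), placeholder
W = np.full((N, N), 1.0 / N)                # doubly stochastic mixing matrix
Q = np.zeros((N, S, A))                     # one Q-table per agent

s = 0
for t in range(20000):
    a = rng.integers(A)                     # uniform exploration
    s_next = rng.choice(S, p=P[s, a])
    # 1) Local TD update: each agent uses only its own reward.
    for i in range(N):
        td = R[i, s, a] + gamma * Q[i, s_next].max() - Q[i, s, a]
        Q[i, s, a] += alpha * td
    # 2) Consensus step: average Q-tables with neighbors via W.
    Q = np.einsum('ij,jsa->isa', W, Q)
    s = s_next

# After sufficient mixing, each agent's Q-table tracks the Q-function of
# the averaged reward; the gap below is the residual consensus error.
print(np.abs(Q[0] - Q.mean(axis=0)).max())
```

The closer $\rho_2(\boldsymbol{W})$ is to 1 (a poorly connected communication graph), the slower the consensus step contracts disagreement between agents, which is why the $\frac{1}{1-\rho_2(\boldsymbol{W})}$ factor shows up in the sample complexity.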
Han-Dong Lim, Donghwan Lee
Computing technology; computer technology
Han-Dong Lim, Donghwan Lee. A finite time analysis of distributed Q-learning [EB/OL]. (2025-07-29) [2025-08-11]. https://arxiv.org/abs/2405.14078.