
A finite time analysis of distributed Q-learning

Source: arXiv
Abstract

Multi-agent reinforcement learning (MARL) has witnessed a remarkable surge in interest, fueled by the empirical success achieved in applications of single-agent reinforcement learning (RL). In this study, we consider a distributed Q-learning scenario, wherein a number of agents cooperatively solve a sequential decision-making problem without access to the central reward function, which is the average of the local rewards. In particular, we provide a finite-time analysis of a distributed Q-learning algorithm and establish a new sample complexity result of $\tilde{\mathcal{O}}\left( \min\left\{\frac{1}{\epsilon^2}\frac{t_{\text{mix}}}{(1-\gamma)^6 d_{\min}^4}, \frac{1}{\epsilon}\frac{\sqrt{|\mathcal{S}||\mathcal{A}|}}{(1-\sigma_2(\boldsymbol{W}))(1-\gamma)^4 d_{\min}^3} \right\}\right)$ under a tabular lookup table setting.
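The abstract describes a consensus-based distributed Q-learning setup in which each agent observes only its own local reward and shares iterates with neighbors through a mixing matrix $\boldsymbol{W}$, whose spectral gap $1-\sigma_2(\boldsymbol{W})$ appears in the bound above. Below is a minimal sketch of that "mix, then local TD step" structure, assuming a doubly stochastic $\boldsymbol{W}$; the toy MDP, step size, and uniform exploration policy are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Hypothetical sketch of distributed Q-learning: consensus averaging
# over a mixing matrix W followed by a local TD update. Problem sizes,
# step size, and the random MDP are all illustrative choices.

n_agents, n_states, n_actions = 4, 5, 2
gamma, alpha = 0.9, 0.1
rng = np.random.default_rng(0)

# Doubly stochastic mixing matrix W; 1 - sigma_2(W) is the spectral gap
# that enters the sample complexity bound.
W = np.full((n_agents, n_agents), 1.0 / n_agents)

# Each agent keeps a local Q-table and sees only its local reward R[i];
# the common target is the Q-function of the *average* reward.
Q = np.zeros((n_agents, n_states, n_actions))
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # toy transition kernel
R = rng.uniform(size=(n_agents, n_states, n_actions))             # local rewards

s = 0
for t in range(10_000):
    a = rng.integers(n_actions)                  # behavior policy: uniform exploration
    s_next = rng.choice(n_states, p=P[s, a])
    # 1) consensus step: mix neighbors' Q-tables through W
    Q_mix = np.tensordot(W, Q, axes=1)
    # 2) local TD step: each agent i uses only its own reward R[i]
    td = R[:, s, a] + gamma * Q_mix[:, s_next, :].max(axis=1) - Q_mix[:, s, a]
    Q = Q_mix
    Q[:, s, a] += alpha * td
    s = s_next
```

Under this structure, the consensus step drives the agents' tables toward their average (at a rate governed by $\sigma_2(\boldsymbol{W})$), while the local TD step injects each agent's reward information, so the averaged iterate tracks the Q-function of the average reward.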

Han-Dong Lim, Donghwan Lee

Subject: Computing Technology, Computer Technology

Han-Dong Lim, Donghwan Lee. A finite time analysis of distributed Q-learning [EB/OL]. (2025-07-29) [2025-08-11]. https://arxiv.org/abs/2405.14078.
