GAWM: Global-Aware World Model for Multi-Agent Reinforcement Learning
Computing technology, computer technology
Shanling Dong, Senlin Zhang, Ping Wei, Meiqin Liu, Zifeng Shi, Ronghao Zheng. GAWM: Global-Aware World Model for Multi-Agent Reinforcement Learning [EB/OL]. (2025-01-17) [2025-10-28]. https://arxiv.org/abs/2501.10116.
In recent years, Model-based Multi-Agent Reinforcement Learning (MARL) has
demonstrated significant advantages over model-free methods in terms of sample
efficiency by using independent environment dynamics world models for data
sample augmentation. However, when the sample budget is not the limiting
factor, these methods still lag behind model-free methods in final convergence
performance and stability. This is primarily due to the world model's
insufficient and unstable representation of global states in partially
observable environments. This limitation hampers the ability to ensure global
consistency in the data samples and results in a time-varying and unstable
distribution mismatch between the pseudo data samples generated by the world
model and the real samples. This issue becomes particularly pronounced in more
complex multi-agent environments. To address this challenge, we propose a
model-based MARL method called GAWM, which enhances the centralized world
model's ability to achieve globally unified and accurate representation of
state information while adhering to the centralized training with
decentralized execution (CTDE) paradigm. GAWM uniquely leverages
an additional Transformer architecture to fuse local observation information
from different agents, thereby improving its ability to extract and represent
global state information. This enhancement not only improves sample efficiency
but also enhances training stability, leading to superior convergence
performance, particularly in complex and challenging multi-agent environments.
This advancement enables model-based methods to be effectively applied to more
complex multi-agent environments. Experimental results demonstrate that GAWM
outperforms various model-free and model-based approaches, achieving
exceptional performance in the challenging domains of SMAC.
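
The fusion step described above, where a Transformer combines the local observations of different agents into a shared representation, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, random projection weights, and mean pooling to obtain one global feature vector, and the names `fuse_observations`, `random_matrix`, and `global_feature` are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_observations(obs, w_q, w_k, w_v):
    """Single-head self-attention across agents (illustrative assumption).

    obs has one row per agent. Returns the fused per-agent features and
    the agent-to-agent attention weights.
    """
    q, k, v = matmul(obs, w_q), matmul(obs, w_k), matmul(obs, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    attn = [softmax([s / scale for s in row]) for row in scores]
    return matmul(attn, v), attn


def random_matrix(rows, cols, rng):
    """Hypothetical helper: Gaussian-initialized matrix."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_agents, obs_dim, d_model = 3, 8, 4  # assumed toy sizes
obs = random_matrix(n_agents, obs_dim, rng)
w_q = random_matrix(obs_dim, d_model, rng)
w_k = random_matrix(obs_dim, d_model, rng)
w_v = random_matrix(obs_dim, d_model, rng)

fused, attn = fuse_observations(obs, w_q, w_k, w_v)

# Mean pooling over agents gives one vector standing in for the global state.
global_feature = [sum(col) / n_agents for col in zip(*fused)]
```

Each row of `attn` shows how strongly one agent's fused feature draws on every agent's observation, which is the mechanism that lets a centralized model form a shared view from partial observations.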