On Equivalence Between Decentralized Policy-Profile Mixtures and Behavioral Coordination Policies in Multi-Agent Systems
On Equivalence Between Decentralized Policy-Profile Mixtures and Behavioral Coordination Policies in Multi-Agent Systems
Constrained decentralized team problem formulations are good models for many cooperative multi-agent systems. Constraints necessitate randomization when solving for optimal solutions -- our past results show that joint randomization amongst the team is necessary for (strong) Lagrangian duality to hold -- , but a better understanding of randomization still remains. For a partially observed multi-agent system with Borel hidden state and finite observations and actions, we prove the equivalence between joint mixtures of decentralized policy-profiles (both pure and behavioral) and common-information based behavioral coordination policies (also mixtures of them). This generalizes past work that shows equivalence between pure decentralized policy-profiles and pure coordination policies. The equivalence can be exploited to develop results on strong duality and number of randomizations.
Nouman Khan、Vijay G. Subramanian
自动化基础理论
Nouman Khan,Vijay G. Subramanian.On Equivalence Between Decentralized Policy-Profile Mixtures and Behavioral Coordination Policies in Multi-Agent Systems[EB/OL].(2025-04-17)[2025-05-29].https://arxiv.org/abs/2504.12635.点此复制
评论