Integrated Control and Active Perception in POMDPs for Temporal Logic Tasks and Information Acquisition
This paper studies the synthesis of a joint control and active perception policy for a stochastic system modeled as a partially observable Markov decision process (POMDP), subject to temporal logic specifications. The POMDP actions influence both system dynamics (control) and the emission function (perception). Beyond task completion, the planner seeks to maximize information gain about certain temporal events (the secret) through coordinated perception and control. To enable active information acquisition, we introduce minimizing the Shannon conditional entropy of the secret as a planning objective, alongside maximizing the probability of satisfying the temporal logic formula within a finite horizon. Using a variant of observable operators in hidden Markov models (HMMs) and POMDPs, we establish key properties of the conditional entropy gradient with respect to policy parameters. These properties facilitate efficient policy gradient computation. We validate our approach through graph-based examples, inspired by common security applications with UAV surveillance.
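To make the entropy objective concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a fixed policy has already been folded into the model, so the POMDP reduces to an HMM, and it simplifies the secret to a binary variable Z indicating whether the hidden state at the final step lies in a given set (the paper's secret is a temporal event, which would require a product with an automaton). The function name `conditional_entropy`, the parameter names, and the example matrices are all hypothetical. The sketch enumerates every observation sequence by brute force and propagates a forward vector with observable operators of the form diag(E[:, o]) T^T to evaluate H(Z | Y).

```python
import itertools
import numpy as np

def conditional_entropy(mu, T, E, secret_states, horizon):
    """Brute-force H(Z | Y_1..Y_horizon) in bits for a small HMM.

    mu: initial state distribution, shape (n,)
    T:  transition matrix, T[i, j] = P(next = j | current = i)
    E:  emission matrix,   E[i, o] = P(obs = o | state = i)
    secret_states: list of state indices; Z = 1 iff the final state is in it
    (Assumed simplification: the secret depends on the final state only.)
    """
    n_obs = E.shape[1]
    # Observable operator for observation o: transition, then weight by emission.
    ops = [np.diag(E[:, o]) @ T.T for o in range(n_obs)]
    h = 0.0
    for ys in itertools.product(range(n_obs), repeat=horizon):
        # alpha[x] = P(y_1..y_t, x_t = x)
        alpha = E[:, ys[0]] * mu
        for o in ys[1:]:
            alpha = ops[o] @ alpha
        p_y = alpha.sum()
        if p_y <= 0.0:
            continue  # this observation sequence cannot occur
        p_z1 = alpha[secret_states].sum()
        for p_zy in (p_z1, p_y - p_z1):
            if p_zy > 1e-15:
                h -= p_zy * np.log2(p_zy / p_y)
    return h

# Illustrative two-state model (values are made up for the example).
mu = np.array([0.5, 0.5])
T = np.array([[0.9, 0.1], [0.2, 0.8]])
E_noisy = np.array([[0.8, 0.2], [0.3, 0.7]])
h_noisy = conditional_entropy(mu, T, E_noisy, [0], horizon=3)
print(f"H(Z | Y) with noisy sensing: {h_noisy:.4f} bits")
```

Under these assumptions, a perfectly informative emission matrix drives the conditional entropy to zero, while an uninformative one leaves it equal to the prior entropy of Z. This is the quantity a perception action would try to reduce. The enumeration is exponential in the horizon and serves only as an illustration; the paper instead works with gradients of the entropy with respect to policy parameters.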
Chongyang Shi, Michael R. Dorothy, Jie Fu
Aerospace technology; automation technology; automation technology equipment
Chongyang Shi, Michael R. Dorothy, Jie Fu. Integrated Control and Active Perception in POMDPs for Temporal Logic Tasks and Information Acquisition [EB/OL]. (2025-04-17) [2025-04-30]. https://arxiv.org/abs/2504.13288.