Learned Controllers for Agile Quadrotors in Pursuit-Evasion Games
The increasing proliferation of small UAVs in civilian and military airspace has raised critical safety and security concerns, especially when unauthorized or malicious drones enter restricted zones. In this work, we present a reinforcement learning (RL) framework for agile 1v1 quadrotor pursuit-evasion. We train neural network policies to command body rates and collective thrust, enabling high-speed pursuit and evasive maneuvers that fully exploit the quadrotor's nonlinear dynamics. To mitigate nonstationarity and catastrophic forgetting during adversarial co-training, we introduce an Asynchronous Multi-Stage Population-Based (AMSPB) algorithm where, at each stage, either the pursuer or evader learns against a sampled opponent drawn from a growing population of past and current policies. This continual learning setup ensures monotonic performance improvement and retention of earlier strategies. Our results show that (i) rate-based policies achieve significantly higher capture rates and peak speeds than velocity-level baselines, and (ii) AMSPB yields stable, monotonic gains against a suite of benchmark opponents.
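The abstract's core training idea, a staged loop in which either the pursuer or the evader learns against an opponent sampled from a growing population of past and current policies, can be sketched as follows. This is a minimal illustration only: the class name, the `train_against` callback, and the alternating stage schedule are hypothetical stand-ins, not the authors' AMSPB implementation.

```python
import random


class AMSPBTrainer:
    """Sketch of an Asynchronous Multi-Stage Population-Based loop.

    Each role keeps a growing pool of past and current policies;
    retaining old policies as opponents is what mitigates
    nonstationarity and catastrophic forgetting."""

    def __init__(self, init_pursuer, init_evader):
        # Populations start with one initial policy per role and only grow.
        self.populations = {"pursuer": [init_pursuer], "evader": [init_evader]}

    def stage(self, learner_role, train_against):
        """One stage: train the learner against a sampled opponent,
        then append the resulting policy to the learner's pool."""
        opponent_role = "evader" if learner_role == "pursuer" else "pursuer"
        # Sample from ALL past and current opponent policies.
        opponent = random.choice(self.populations[opponent_role])
        current = self.populations[learner_role][-1]
        # train_against is a placeholder for an RL update (e.g. PPO).
        new_policy = train_against(current, opponent)
        self.populations[learner_role].append(new_policy)
        return new_policy

    def run(self, n_stages, train_against):
        # A simple alternating schedule; the actual stage ordering
        # in the paper may differ.
        for i in range(n_stages):
            role = "pursuer" if i % 2 == 0 else "evader"
            self.stage(role, train_against)
```

Because every past policy stays in the pool, a learner that beats a freshly sampled opponent must on average also handle earlier strategies, which is the mechanism behind the claimed monotonic improvement.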
Alejandro Sanchez Roncero, Olov Andersson, Petter Ogren
Subject areas: Automation technology and equipment; computing and computer technology; aerospace technology
Alejandro Sanchez Roncero, Olov Andersson, Petter Ogren. Learned Controllers for Agile Quadrotors in Pursuit-Evasion Games [EB/OL]. (2025-06-03) [2025-06-16]. https://arxiv.org/abs/2506.02849.