Learned Controllers for Agile Quadrotors in Pursuit-Evasion Games
The increasing proliferation of small UAVs in civilian and military airspace has raised critical safety and security concerns, especially when unauthorized or malicious drones enter restricted zones. In this work, we present a reinforcement learning (RL) framework for agile 1v1 quadrotor pursuit-evasion. We train neural network policies to command body rates and collective thrust, enabling high-speed pursuit and evasive maneuvers that fully exploit the quadrotor's nonlinear dynamics. To mitigate nonstationarity and catastrophic forgetting during adversarial co-training, we introduce an Asynchronous Multi-Stage Population-Based (AMSPB) algorithm where, at each stage, either the pursuer or evader learns against a sampled opponent drawn from a growing population of past and current policies. This continual learning setup ensures monotonic performance improvement and retention of earlier strategies. Our results show that (i) rate-based policies achieve significantly higher capture rates and peak speeds than velocity-level baselines, and (ii) AMSPB yields stable, monotonic gains against a suite of benchmark opponents.
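The abstract's core training idea, a staged loop in which either the pursuer or the evader learns against an opponent sampled from a growing population of past and current policies, can be sketched as follows. This is a minimal illustration only: the class name, the `train_against` callback, and the alternating stage schedule are hypothetical stand-ins, not the authors' AMSPB implementation.

```python
import random


class AMSPBTrainer:
    """Sketch of an Asynchronous Multi-Stage Population-Based loop.

    Each role keeps a growing pool of past and current policies;
    retaining old policies as opponents is what mitigates
    nonstationarity and catastrophic forgetting."""

    def __init__(self, init_pursuer, init_evader):
        # Populations start with one initial policy per role and only grow.
        self.populations = {"pursuer": [init_pursuer], "evader": [init_evader]}

    def stage(self, learner_role, train_against):
        """One stage: train the learner against a sampled opponent,
        then append the resulting policy to the learner's pool."""
        opponent_role = "evader" if learner_role == "pursuer" else "pursuer"
        # Sample from ALL past and current opponent policies.
        opponent = random.choice(self.populations[opponent_role])
        current = self.populations[learner_role][-1]
        # train_against is a placeholder for an RL update (e.g. PPO).
        new_policy = train_against(current, opponent)
        self.populations[learner_role].append(new_policy)
        return new_policy

    def run(self, n_stages, train_against):
        # A simple alternating schedule; the actual stage ordering
        # in the paper may differ.
        for i in range(n_stages):
            role = "pursuer" if i % 2 == 0 else "evader"
            self.stage(role, train_against)
```

Because every past policy stays in the pool, a learner that beats a freshly sampled opponent must on average also handle earlier strategies, which is the mechanism behind the claimed monotonic improvement.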
Alejandro Sanchez Roncero, Olov Andersson, Petter Ogren
Subject areas: Automation technology and equipment; computing and computer technology; aerospace technology
Alejandro Sanchez Roncero, Olov Andersson, Petter Ogren. Learned Controllers for Agile Quadrotors in Pursuit-Evasion Games [EB/OL]. (2025-06-03) [2025-06-16]. https://arxiv.org/abs/2506.02849.