Research on the Application of UAV Autonomous Navigation Based on Deep Reinforcement Learning (DRL)

Abstract

Traditional UAV autonomous navigation algorithms suffer from low accuracy across the implementation pipeline and from the heavy computational cost of building high-precision maps. To address these problems, this paper proposes a new navigation algorithm, based on deep learning and reinforcement learning, that is implemented end to end and requires no prior map. First, a high-fidelity, high-performance simulation environment is designed and built on the AirSim and UE4 platforms for training UAV autonomous navigation tasks. The environment supports domain randomization and the gym interface, and provides photo-realistic information quality. Second, the autonomous navigation task is modeled systematically as a reinforcement learning problem: the state space and action space are designed, a reward function is formulated, and a network model for the task is designed and trained. The PPO algorithm, which is built on the actor-critic framework, is then improved to address the limitation that a single actor is suited only to simple scenarios, yielding a multi-actor, single-critic PPO network structure that incorporates an attention mechanism. Finally, simulation experiments verify the feasibility and effectiveness of the method.
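The abstract names a multi-actor, single-critic PPO structure with an attention mechanism but gives no architectural detail. The following is a minimal sketch of one way such a structure could combine its parts, not the authors' implementation. The function names `fuse_actor_outputs` and `ppo_clipped_surrogate`, the idea of blending actor outputs by softmax attention weights, and the value `clip_eps=0.2` are all assumptions made for illustration. Only the clipped surrogate itself follows the standard PPO formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_actor_outputs(actor_actions, attention_scores):
    """Blend the action proposals of several actors with attention weights.

    actor_actions: list of K action vectors, one per (hypothetical) actor head.
    attention_scores: list of K unnormalized scores. In a real network these
    would be computed from the state by a learned layer; here they are
    passed in directly (an assumption made for illustration).
    """
    weights = softmax(attention_scores)
    dim = len(actor_actions[0])
    return [
        sum(w * action[i] for w, action in zip(weights, actor_actions))
        for i in range(dim)
    ]


def ppo_clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate objective, averaged over samples.

    ratios: new-policy / old-policy probability ratios, one per sample.
    advantages: advantage estimates (from the single critic), one per sample.
    """
    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * adv, clipped * adv)
    return total / len(ratios)


if __name__ == "__main__":
    # Two hypothetical actors proposing 2-D actions, equally weighted.
    print(fuse_actor_outputs([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
    # A ratio above 1 + clip_eps with positive advantage is clipped.
    print(ppo_clipped_surrogate([1.5], [1.0]))
```

In this sketch the single critic enters only through the advantage estimates, while the attention weights decide how much each actor contributes to the final action. Other designs, such as selecting one actor per scenario, would fit the abstract's description equally well.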

Authors: Wei Shimin (魏世民), Ma Yunpeng (马云鹏)

Subject classification: Radio Navigation; Aerospace Technology; Automation Technology and Automation Equipment

Keywords: Computer Application Technology; Unmanned Aerial Vehicle; Deep Reinforcement Learning; Attention Mechanism

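Recommended citation: Wei Shimin, Ma Yunpeng. Research on the Application of UAV Autonomous Navigation Based on Deep Reinforcement Learning [EB/OL]. (2023-03-09) [2025-08-02]. http://www.paper.edu.cn/releasepaper/content/202303-102.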