Multi-UAV Cooperative Localization Based on Deep Reinforcement Learning in Denied Environments

  • 万开方 ,
  • 吴志林 ,
  • 武韫晖 ,
  • 强皓植 ,
  • 吴艺博 ,
  • 李波
  • Northwestern Polytechnical University

Received date: 2024-08-01

  Revised date: 2024-12-03

  Online published: 2024-12-05

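Funding

National Natural Science Foundation of China; Key Research and Development Program of Shaanxi Province; Fundamental Research Funds for the Central Universities; Key Laboratory of Electromagnetic Space Operations and Applications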

Scheduling Cooperative Location of Multiple UAVs with Deep Reinforcement Learning in GPS-denied Environment

  • WAN Kai-Fang ,
  • WU Zhi-Lin ,
  • WU Yun-Hui ,
  • QIANG Hao-Zhi ,
  • WU Yi-Bo ,
  • LI Bo

Received date: 2024-08-01

  Revised date: 2024-12-03

  Online published: 2024-12-05

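Abstract

To address the problem that, in strongly adversarial scenarios, jamming can disable a UAV's GPS and prevent it from obtaining an accurate position, and considering that UAVs often operate in formations or swarms, this paper proposes a method in which the UAVs within a formation measure their relative spatial positions and localize one another, so that a UAV can continue to update its position in real time after losing the GPS signal. For the GPS-denied environment, the theory of the partially observable Markov decision process (POMDP) is introduced, the elements of the POMDP model are analyzed, and a POMDP decision model for cooperative localization scheduling is established. A belief state update method based on the extended Kalman filter (EKF) and a Q-value estimation method based on the Deep Q-Network (DQN) from deep reinforcement learning are proposed to achieve accurate real-time cooperative localization. Application tests in different scenarios show that the model can efficiently manage and schedule the UAVs in the formation whose GPS is working normally and can direct them to cooperatively localize the UAVs whose GPS has failed, which validates the model.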

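Cite this article

WAN Kai-Fang, WU Zhi-Lin, WU Yun-Hui, QIANG Hao-Zhi, WU Yi-Bo, LI Bo. Scheduling Cooperative Location of Multiple UAVs with Deep Reinforcement Learning in GPS-denied Environment[J]. Acta Aeronautica et Astronautica Sinica (航空学报), 0: 1-0. DOI: 10.7527/S1000-6893.2024.31024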

Abstract

In strongly adversarial scenarios, unmanned aerial vehicles (UAVs) often lose GPS to interference and can no longer determine their own positions accurately. Because UAVs usually operate in formations or clusters, this paper proposes a strategy in which the UAVs in a formation measure their relative spatial positions and locate one another, so that each UAV can keep updating its position in real time even after GPS signal loss. First, for the GPS-denied environment, the paper introduces the theory of the partially observable Markov decision process (POMDP), analyzes the elements of the POMDP model, and establishes a POMDP decision model for cooperative positioning scheduling. Second, it proposes a belief state update method based on the extended Kalman filter (EKF) and a Q-value estimation method based on the Deep Q-Network (DQN) from deep reinforcement learning, to achieve accurate real-time cooperative positioning. Application tests in different scenarios show that the proposed model efficiently manages and schedules the UAVs in the formation that still have GPS, directing them to cooperatively and effectively position the UAVs whose GPS has failed.
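As a rough illustration of what an EKF-based belief update can look like in this setting, the sketch below performs one measurement update of a GPS-denied UAV's position estimate from a single range measurement taken by a teammate whose position is known. The 2-D state, the range-only measurement model, and the names `ekf_range_update`, `anchor`, and `r_var` are assumptions made for this example; the abstract does not specify the paper's state vector, sensor model, or notation, and the DQN scheduling step is not shown.

```python
import math


def ekf_range_update(x, P, anchor, z, r_var):
    """One EKF measurement update for a 2-D position belief (illustrative).

    x      : [px, py] prior mean of the GPS-denied UAV's position
    P      : 2x2 prior covariance as nested lists
    anchor : [ax, ay] known position of the measuring UAV (assumed exact)
    z      : measured range between the two UAVs
    r_var  : variance of the range measurement noise
    Returns (posterior mean, posterior covariance).
    """
    dx, dy = x[0] - anchor[0], x[1] - anchor[1]
    pred = math.hypot(dx, dy)  # predicted range h(x)
    if pred == 0.0:
        raise ValueError("prior mean coincides with anchor; Jacobian undefined")

    # Jacobian of h(x) = ||x - anchor|| evaluated at the prior mean (1x2 row).
    H = [dx / pred, dy / pred]

    # P H^T (2x1) and the scalar innovation variance S = H P H^T + R.
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var

    # Kalman gain K = P H^T / S and the innovation z - h(x).
    K = [PHt[0] / S, PHt[1] / S]
    innov = z - pred

    x_new = [x[0] + K[0] * innov, x[1] + K[1] * innov]

    # Posterior covariance P_new = (I - K H) P = P - K (H P).
    HP = [H[0] * P[0][0] + H[1] * P[1][0],
          H[0] * P[0][1] + H[1] * P[1][1]]
    P_new = [[P[i][j] - K[i] * HP[j] for j in range(2)] for i in range(2)]
    return x_new, P_new


if __name__ == "__main__":
    mean, cov = ekf_range_update(
        x=[0.0, 0.0], P=[[4.0, 0.0], [0.0, 4.0]],
        anchor=[-10.0, 0.0], z=12.0, r_var=1.0,
    )
    print(mean, cov)
```

In this example the teammate lies on the x-axis, so the update shifts the estimate along x and shrinks the variance in that direction only; the uncertainty along y is untouched. That is one way to see why the choice of which teammate measures which UAV matters, which is the scheduling decision the abstract assigns to the POMDP model.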