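Adversarial Reinforcement Learning-based UAV Escape Path Planning Method in the Context of Roundups

  • HUANG Xiang-Song,
  • WANG Meng-Yu,
  • PAN Da-Peng
  • College of Information and Communication Engineering, Harbin Engineering University

Received date: 2024-12-09

  Revised date: 2025-04-01

  Online published: 2025-04-07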

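Cite this article

HUANG Xiang-Song, WANG Meng-Yu, PAN Da-Peng. Adversarial Reinforcement Learning-based UAV Escape Path Planning Method in the Context of Roundups[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2024.31637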

Abstract

As drone technology develops rapidly, defending against malicious pursuit by other drones has become an important problem in drone security. This study uses adversarial reinforcement learning to improve the adaptability and survivability of drones in hostile environments. The adversarial reinforcement learning framework addresses the way erroneous information received during escape interferes with the drone's decision-making: building on the adversarial interaction between the pursuers and the evader, it optimizes the evader's policy to counter the pursuers' behavior. In addition, to address the sparse reward problem of traditional reinforcement learning methods, the paper combines the artificial potential field method with a stepwise reward function so that the drone adapts better to the encirclement environment. Results show that, compared with the PPO algorithm, the proposed algorithm raises the drone's escape success rate by 18.14% and cuts transportation time by 52.2%, significantly improving transportation efficiency. The study provides a new technical solution for drone security protection and explores the potential of adversarial reinforcement learning in malicious-pursuit scenarios.
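
The abstract does not give the form of the stepwise reward, so the following is only an illustrative sketch of how an artificial potential field can turn a sparse terminal reward into a dense per-step signal. It assumes a classic potential (quadratic attraction to a goal, repulsion inside an influence radius `d0` around each pursuer) and rewards each step by the resulting drop in potential. The function names, the gains `k_att` and `k_rep`, and the radius `d0` are hypothetical and are not taken from the paper.

```python
import math


def potential(pos, goal, pursuers, k_att=1.0, k_rep=1.0, d0=2.0):
    """Assumed artificial potential: attraction to the goal plus repulsion from pursuers."""
    # Attractive term grows with the squared distance to the goal.
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    for p in pursuers:
        # Clamp the distance to avoid division by zero at a pursuer's position.
        d = max(math.dist(pos, p), 1e-6)
        # Repulsive term acts only inside the influence radius d0.
        if d < d0:
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u


def stepwise_reward(prev_pos, next_pos, goal, pursuers, **gains):
    """Dense per-step reward: positive when the step lowers the potential."""
    return (potential(prev_pos, goal, pursuers, **gains)
            - potential(next_pos, goal, pursuers, **gains))


if __name__ == "__main__":
    # One step toward the goal while the only pursuer is out of range.
    print(stepwise_reward((0.0, 0.0), (1.0, 0.0), (10.0, 0.0), [(0.0, 5.0)]))
```

Under these assumptions, a step that approaches the goal while staying outside every pursuer's influence radius earns a positive reward, and a step that closes in on a pursuer is penalized, so the evader receives feedback at every step rather than only at capture or arrival.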