
Acta Aeronautica et Astronautica Sinica ›› 2025, Vol. 46 ›› Issue (17): 331637. doi: 10.7527/S1000-6893.2024.31637

• Electronics and Electrical Engineering and Control •

Adversarial reinforcement learning-based UAV escape path planning method

Xiangsong HUANG1,2, Mengyu WANG1, Dapeng PAN1,2

  1. College of Information And Communication Engineering, Harbin Engineering University, Harbin 150001, China
  2. Key Laboratory of Advanced Marine Communication and Information Technology, Ministry of Industry and Information Technology, Harbin Engineering University, Harbin 150001, China
  • Received: 2024-12-09 Revised: 2025-01-10 Accepted: 2025-03-18 Online: 2025-04-15 Published: 2025-04-07
  • Contact: Dapeng PAN E-mail: pandapeng@hrbeu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62001136)

Abstract:

With the rapid development of drone technology, countering malicious pursuit by other drones has become an important problem in drone security protection. To improve a drone’s adaptability and survivability in hostile environments, this work adopts an adversarial reinforcement learning framework that specifically addresses erroneous information interfering with decision-making during evasion. Building on the adversarial interaction between pursuers and evaders, the strategy of the transport drone is optimized to counter the pursuers’ behavior. To overcome the sparse reward problem inherent in traditional reinforcement learning methods, a progressive reward mechanism incorporating the artificial potential field method is proposed, which enables the drone to adapt more effectively to the pursuit environment. The results show that, compared with the Proximal Policy Optimization (PPO) algorithm, the proposed algorithm increases the drone’s escape success rate by 54.47% and reduces transport time by 34.35%, significantly improving the drone’s transport efficiency. These findings offer a new technical solution for drone security protection and explore the application potential of adversarial reinforcement learning in malicious-pursuit scenarios.
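
The abstract does not give the form of the reward function. As a rough illustration only, the sketch below shows one common way an artificial potential field can densify a sparse reward: potential-based shaping, where each step is rewarded for moving to a state of lower potential. This is an assumed, generic formulation and not the authors' method; the function names, the gains `k_att` and `k_rep`, the influence radius `d0`, and the discount `gamma` are all hypothetical.

```python
import math


def potential(pos, goal, pursuers, k_att=1.0, k_rep=1.0, d0=5.0):
    """Artificial potential at `pos` (lower is better). Hypothetical gains."""
    # Attractive term: grows with the squared distance to the goal.
    u = 0.5 * k_att * math.dist(pos, goal) ** 2
    # Repulsive term: only pursuers closer than the influence radius d0 count.
    for p in pursuers:
        d = max(math.dist(pos, p), 1e-6)  # guard against division by zero
        if d < d0:
            u += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return u


def shaped_reward(pos, next_pos, goal, pursuers, sparse_reward=0.0, gamma=0.99):
    """Sparse reward plus a potential-based shaping term (assumed form)."""
    # Negate the field so that states nearer the goal and farther from
    # pursuers have a higher shaping potential.
    phi = -potential(pos, goal, pursuers)
    phi_next = -potential(next_pos, goal, pursuers)
    return sparse_reward + gamma * phi_next - phi


if __name__ == "__main__":
    goal = (10.0, 0.0)
    far_pursuer = [(0.0, 20.0)]  # outside the influence radius
    toward = shaped_reward((0.0, 0.0), (1.0, 0.0), goal, far_pursuer)
    away = shaped_reward((0.0, 0.0), (-1.0, 0.0), goal, far_pursuer)
    print(toward > away)
```

With this shaping, a step toward the goal earns more than a step away from it, and the same step earns less when it closes in on a nearby pursuer, so the agent receives a learning signal before any terminal escape or capture event.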

Key words: adversarial training, reinforcement learning, escape path planning, escape decision making, reward function

CLC Number: