ACTA AERONAUTICA ET ASTRONAUTICA SINICA ›› 2020, Vol. 41 ›› Issue (10): 324000-324000. doi: 10.7527/S1000-6893.2020.24000

• Electronics and Electrical Engineering and Control •

Pursuit missions for UAV swarms based on DDPG algorithm

ZHANG Yaozhong1, XU Jialin1, YAO Kangjia1, LIU Jieling2   

  1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China;
    2. Xi'an North Electro-optic Science & Technology Co. Ltd, Xi'an 710043, China
  • Received:2020-03-21 Revised:2020-06-15 Published:2020-06-12
  • Supported by:
    Aeronautical Science Foundation of China (2017ZC53033)

Abstract: Unmanned Aerial Vehicle (UAV) swarm technology has become a research hotspot in recent years. As the autonomous intelligence of UAVs continues to advance, swarm operation is expected to become one of the main trends in future UAV development. For the cooperative pursuit of an enemy target by a UAV swarm, we establish a typical task scenario and, based on the Deep Deterministic Policy Gradient (DDPG) algorithm, design a guided reward function that effectively addresses the sparse-reward problem of deep reinforcement learning in long-horizon missions. We also introduce a soft update strategy based on a sliding average to reduce parameter oscillations in the evaluation (Eval) network and the target network during training, thereby improving training efficiency. Simulation results show that, after training, the UAV swarm carries out the pursuit mission with a success rate of 95%. As a new combat mode, UAV swarm technology has potential value in the military field, and the proposed algorithm shows promise for autonomous decision-making by UAV swarms.
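The sliding-average soft update mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact update rule or coefficient, so this assumes the common form used with DDPG, target ← τ · online + (1 − τ) · target. The function name `soft_update`, the dictionary-of-lists parameter layout, and the default `tau` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def soft_update(target: Params, online: Params, tau: float = 0.01) -> Params:
    """Return target parameters moved a fraction tau toward the online ones.

    Assumed rule (not taken from the paper):
        new_target = tau * online + (1 - tau) * target
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    if target.keys() != online.keys():
        raise ValueError("target and online must have the same parameter names")

    updated: Params = {}
    for name, t_vals in target.items():
        o_vals = online[name]
        if len(t_vals) != len(o_vals):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise sliding (exponential moving) average of the weights.
        updated[name] = [(1.0 - tau) * t + tau * o for t, o in zip(t_vals, o_vals)]
    return updated


if __name__ == "__main__":
    target_net = {"w": [0.0, 1.0]}
    online_net = {"w": [1.0, 1.0]}
    # Repeated small steps pull the target toward the online weights gradually.
    for _ in range(3):
        target_net = soft_update(target_net, online_net, tau=0.1)
    print(target_net)
```

With a small τ the target network changes slowly between training steps, which is the usual motivation for this kind of update: the bootstrapped value targets drift smoothly instead of jumping each time the online network is updated.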

Key words: DDPG algorithm, UAV swarms, task decision, deep reinforcement learning, sparse rewards
