
ACTA AERONAUTICA ET ASTRONAUTICA SINICA ›› 2023, Vol. 44 ›› Issue (7): 327083-327083. doi: 10.7527/S1000-6893.2022.27083

• Electronics and Electrical Engineering and Control •

Maneuvering decision-making of multi-UAV attack-defence confrontation based on PER-MATD3

Xiaowei FU1, Zhe XU2, Jindong ZHU1,3, Nan WANG1

  1. School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China
  2. Xi’an Institute of Applied Optics, Xi’an 710065, China
  3. AVIC Shenyang Aircraft Design Research Institute, Shenyang 110035, China
  • Received: 2022-02-28 Revised: 2022-03-23 Accepted: 2022-05-11 Online: 2023-04-15 Published: 2022-05-19
  • Contact: Xiaowei FU, E-mail: fxw@nwpu.edu.cn
  • Supported by:
    Aeronautical Science Foundation of China (2020Z023053001)

Abstract:

This paper studies maneuvering decision-making for multi-UAV attack-defence confrontation in a complex environment with randomly distributed obstacles. A motion model and a radar detection model are constructed for both the attacking and the defending side. The Twin Delayed Deep Deterministic policy gradient (TD3) algorithm is extended to the multi-agent setting to address the value-function overestimation of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. To improve learning efficiency, a Prioritized Experience Replay Multi-Agent Twin Delayed Deep Deterministic policy gradient (PER-MATD3) algorithm is proposed, based on the prioritized experience replay mechanism. Simulation experiments show that the proposed method achieves good confrontation performance in multi-UAV attack-defence maneuvering decision-making, and comparative experiments verify the advantages of PER-MATD3 over other algorithms in convergence speed and stability.
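The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard TD3 clipped double-Q target and the proportional variant of prioritized experience replay, which the abstract does not specify. The function names and the values of `gamma`, `alpha`, `beta` and `eps` are hypothetical, and the multi-agent parts (per-agent critics, delayed policy updates, target smoothing) are omitted.

```python
import random


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two
    target-critic estimates to counteract value overestimation."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization (assumed variant):
    p_i = (|delta_i| + eps) ** alpha, P(i) = p_i / sum_k p_k."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i)) ** (-beta),
    normalized by the largest weight so that all weights are <= 1."""
    n = len(probs)
    weights = [(n * p) ** (-beta) for p in probs]
    max_w = max(weights)
    return [w / max_w for w in weights]


def sample_indices(probs, batch_size, rng=random):
    """Draw replay-buffer indices with probability proportional to priority."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


if __name__ == "__main__":
    # Hypothetical TD errors of four stored transitions.
    errors = [0.1, 1.0, 2.0, 0.5]
    probs = per_probabilities(errors)
    batch = sample_indices(probs, batch_size=2)
    weights = importance_weights(probs)
    print(batch, [round(weights[i], 3) for i in batch])
    print(td3_target(reward=1.0, done=0.0, q1_next=2.0, q2_next=3.0))
```

Transitions with larger temporal-difference error are replayed more often, and the importance weights scale down their contribution to the critic loss to compensate for the non-uniform sampling.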

Key words: multi-UAVs, multi-agent reinforcement learning, PER-MATD3, attack-defence confrontation, maneuvering decision-making
