ACTA AERONAUTICA ET ASTRONAUTICA SINICA ›› 2023, Vol. 44 ›› Issue (4): 126731-126731. doi: 10.7527/S1000-6893.2022.26731

• Fluid Mechanics and Flight Mechanics •

Intelligent air combat decision making and simulation based on deep reinforcement learning

Pan ZHOU1, Jiangtao HUANG1, Sheng ZHANG1, Gang LIU2, Bowen SHU1,3, Jigang TANG1

  1. Aerospace Technology Institute, China Aerodynamics Research and Development Center, Mianyang 621000, China
  2. China Aerodynamics Research and Development Center, Mianyang 621000, China
  3. School of Aeronautics, Northwestern Polytechnical University, Xi’an 710072, China
  • Received: 2021-12-02 Revised: 2022-01-12 Accepted: 2022-01-17 Online: 2022-01-28 Published: 2022-01-26
  • Contact: Jiangtao HUANG E-mail: hjtcyf@163.com
  • Supported by:
    Provincial or Ministry Level Project

Abstract:

Intelligent decision-making for air combat is a research focus among the world's military powers. To address Unmanned Aerial Vehicle (UAV) maneuver decision-making in the close-range air combat game, an autonomous decision-making model based on deep reinforcement learning is proposed. The model adopts and improves a reward function that jointly considers attack angle advantage, speed advantage, altitude advantage, and distance advantage. The improved reward function prevents the agent from being lured into crashing into the ground by the enemy aircraft, and effectively guides the agent toward the optimal solution. To address the slow convergence caused by random sampling in reinforcement learning, a value-based prioritization method for experience pool samples is designed, which significantly accelerates convergence while preserving the convergence of the algorithm. The decision-making model is verified on a human-machine confrontation simulation platform, and the results show that the model can gain the upper hand over both the expert system and the human pilot in close-range air combat.
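
The idea of prioritizing experience pool samples by value, instead of drawing them uniformly at random, can be illustrated with a minimal sketch. The abstract does not give the priority definition, so everything below is an assumption made for illustration: the class name `ValuePrioritizedBuffer`, the priority `|value_estimate| + eps`, and sampling with replacement in proportion to that priority. It is not the authors' implementation.

```python
import random
from collections import deque


class ValuePrioritizedBuffer:
    """Experience pool that samples transitions in proportion to a value score.

    Assumption (not from the paper): priority = |value_estimate| + eps,
    and a batch is drawn with replacement, weighted by priority.
    """

    def __init__(self, capacity, eps=1e-3, seed=None):
        # Bounded deques drop the oldest entry once capacity is reached.
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.eps = eps  # keeps zero-value samples reachable
        self.rng = random.Random(seed)

    def add(self, transition, value_estimate):
        """Store a transition together with its (hypothetical) value score."""
        self.transitions.append(transition)
        self.priorities.append(abs(value_estimate) + self.eps)

    def sample(self, batch_size):
        """Draw a batch; higher-priority transitions are picked more often."""
        return self.rng.choices(
            list(self.transitions),
            weights=list(self.priorities),
            k=batch_size,
        )

    def __len__(self):
        return len(self.transitions)


if __name__ == "__main__":
    pool = ValuePrioritizedBuffer(capacity=1000, seed=0)
    # Transitions are placeholders here: (state, action, reward, next_state).
    pool.add(("s0", "a0", 0.1, "s1"), value_estimate=0.2)
    pool.add(("s1", "a1", 1.0, "s2"), value_estimate=5.0)
    batch = pool.sample(batch_size=4)
    print(len(batch))
```

In a sketch like this, the small `eps` term keeps low-value transitions from never being replayed, which is one common way to retain coverage of the pool; whether the paper uses a comparable safeguard is not stated in the abstract.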

Key words: air combat, autonomous decision-making, deep reinforcement learning, TD3 algorithm, sparse rewards
