Fluid Mechanics and Flight Mechanics

Air combat maneuver decision-making test based on deep reinforcement learning

  • Sheng ZHANG (章胜),
  • Pan ZHOU (周攀),
  • Yang HE (何扬),
  • Jiangtao HUANG (黄江涛),
  • Gang LIU (刘刚),
  • Jigang TANG (唐骥罡),
  • Huaizhi JIA (贾怀智),
  • Xin DU (杜昕)

  • 1. Aerospace Technology Institute, China Aerodynamics Research and Development Center, Mianyang 621000, China
    2. China Aerodynamics Research and Development Center, Mianyang 621000, China
    3. School of Aeronautics, Northwestern Polytechnical University, Xi’an 710000, China

E-mail: hjtcyf@163.com

Received date: 2022-10-08

  Revised date: 2023-01-05

  Accepted date: 2023-02-15

  Online published: 2023-02-24

Supported by

National Natural Science Foundation of China (11902332)

Abstract

Intelligent decision-making in air combat will greatly change the form and mode of future warfare. Decision-making machines based on deep reinforcement learning can exploit the potential of an aircraft and are an important technical paradigm for intelligent air combat decision-making, yet their engineering implementation has rarely been reported. To address the engineering implementation of deep-reinforcement-learning-based intelligent maneuver decision-making in one-on-one close-range air combat, an online deep neural network maneuver decision-making model suitable for practical application is developed, together with a maneuver control scheme in which the flight control law tracks the trajectory guidance decision commands. The corresponding software and hardware are implemented and human-machine combat flight tests are carried out, achieving the transfer of intelligent air combat from virtual simulation to real flight. The results show that, with the close-range air combat maneuver decision-making and control method developed in this paper, the intelligent unmanned aircraft quickly makes maneuver decisions in its own favor and rapidly gains a situational advantage through maneuvering when fighting human “pilots”. The flight test results demonstrate the potential application value of deep neural network intelligent decision-making in air combat decision-making.

Cite this article

ZHANG S, ZHOU P, HE Y, HUANG J T, LIU G, TANG J G, JIA H Z, DU X. Air combat maneuver decision-making test based on deep reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(10): 128094. DOI: 10.7527/S1000-6893.2023.28094 (in Chinese)
