ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2023, Vol. 44, Issue 4: 126731. doi: 10.7527/S1000-6893.2022.26731
• Fluid Mechanics and Flight Mechanics •
Pan ZHOU1, Jiangtao HUANG1, Sheng ZHANG1, Gang LIU2, Bowen SHU1,3, Jigang TANG1
Received: 2021-12-02
Revised: 2022-01-12
Accepted: 2022-01-17
Online: 2022-01-28
Published: 2022-01-26
Contact: Jiangtao HUANG
E-mail: hjtcyf@163.com
Pan ZHOU, Jiangtao HUANG, Sheng ZHANG, Gang LIU, Bowen SHU, Jigang TANG. Intelligent air combat decision making and simulation based on deep reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(4): 126731.
Address: No. 238, Baiyan Building, Beisihuan Zhonglu Road, Haidian District, Beijing, China
Postal code: 100083
E-mail: hkxb@buaa.edu.cn
All copyright © editorial office of Chinese Journal of Aeronautics