ACTA AERONAUTICA ET ASTRONAUTICA SINICA ›› 2023, Vol. 44 ›› Issue (4): 126731-126731. doi: 10.7527/S1000-6893.2022.26731
• Fluid Mechanics and Flight Mechanics •
Pan ZHOU1, Jiangtao HUANG1, Sheng ZHANG1, Gang LIU2, Bowen SHU1,3, Jigang TANG1
Received: 2021-12-02
Revised: 2022-01-12
Accepted: 2022-01-17
Online: 2022-01-28
Published: 2022-01-26
Contact: Jiangtao HUANG
E-mail: hjtcyf@163.com
Pan ZHOU, Jiangtao HUANG, Sheng ZHANG, Gang LIU, Bowen SHU, Jigang TANG. Intelligent air combat decision making and simulation based on deep reinforcement learning[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2023, 44(4): 126731-126731.
1 | SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489. |
2 | Defense Advanced Research Projects Agency. AlphaDogfight trials go virtual for final event[EB/OL]. (2020-08-07) [2021-03-10]. |
3 | SUN Z X, YANG S Q, PIAO H Y, et al. A survey of air combat artificial intelligence[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(8): 525799 (in Chinese). |
4 | PARK H, LEE B Y, TAHK M J, et al. Differential game based air combat maneuver generation using scoring function matrix[J]. International Journal of Aeronautical and Space Sciences, 2016, 17(2): 204-213. |
5 | WEINTRAUB I E, PACHTER M, GARCIA E. An introduction to pursuit-evasion differential games[C]∥ 2020 American Control Conference (ACC). Piscataway: IEEE Press, 2020: 1049-1066. |
6 | MCGREW J S. Real-time maneuvering decisions for autonomous air combat[D]. Cambridge: Massachusetts Institute of Technology, 2008: 91-104. |
7 | KANESHIGE J, KRISHNAKUMAR K. Artificial immune system approach for air combat maneuvering[C]∥Proceedings of the SPIE, 2007. |
8 | XUE Y, ZHUANG Y, ZHANG Y Y, et al. Multiple UCAV cooperative jamming air combat decision making based on heuristic self-adaptive discrete differential evolution algorithm[J]. Acta Aeronautica et Astronautica Sinica, 2013, 34(2): 343-351 (in Chinese). |
9 | BURGIN G H. Improvements to the adaptive maneuvering logic program: NASA CR 3985[R]. Washington, D.C.: NASA, 1986. |
10 | ZUO J L, YANG R N, ZHANG Y, et al. Intelligent decision-making in air combat maneuvering based on heuristic reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2017, 38(10): 321168 (in Chinese). |
11 | ZHANG Y Z, XU J L, YAO K J, et al. Pursuit missions for UAV swarms based on DDPG algorithm[J]. Acta Aeronautica et Astronautica Sinica, 2020, 41(10): 324000 (in Chinese). |
12 | DU H W, CUI M L, HAN T, et al. Maneuvering decision in air combat based on multi-objective optimization and reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2018, 44(11): 2247-2256 (in Chinese). |
13 | SHI W, FENG Y H, CHENG G Q, et al. Research on multi-aircraft cooperative air combat method based on deep reinforcement learning[J]. Acta Automatica Sinica, 2021, 47(7): 1610-1623 (in Chinese). |
14 | ZHANG Q, YANG R N, YU L X, et al. BVR air combat maneuvering decision by using Q-network reinforcement learning[J]. Journal of Air Force Engineering University (Natural Science Edition), 2018, 19(6): 8-14 (in Chinese). |
15 | LI Y T, HAN T, SUN C, et al. An optimization method of air combat situation assessment function based on inverse reinforcement learning[J]. Fire Control & Command Control, 2019, 44(8): 101-106 (in Chinese). |
16 | SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. 2nd ed. London: MIT Press, 2018. |
17 | HINTON G E, OSINDERO S, TEH Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527-1554. |
18 | WATKINS C J C H, DAYAN P. Q-learning[J]. Machine Learning, 1992, 8(3): 279-292. |
19 | RUMMERY G A, NIRANJAN M. On-line Q-learning using connectionist systems[M]. Cambridge: University of Cambridge, 1994. |
20 | SCHULMAN J, LEVINE S, MORITZ P, et al. Trust region policy optimization[C]∥Proceedings of the 32nd International Conference on Machine Learning, 2015: 1889-1897. |
21 | SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[EB/OL]. 2017: arXiv:1707.06347. |
22 | KONDA V R, TSITSIKLIS J N. On actor-critic algorithms[J]. SIAM Journal on Control and Optimization, 2003, 42(4): 1143-1166. |
23 | LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[C]∥4th International Conference on Learning Representations, ICLR 2016-Conference Track Proceedings, 2016. |
24 | MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533. |
25 | FUJIMOTO S, VAN HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[C]∥Proceedings of the 35th International Conference on Machine Learning, 2018: 1587-1596. |
26 | WEI H. Research of UCAV air combat based on reinforcement learning[D]. Harbin: Harbin Institute of Technology, 2015: 42-43 (in Chinese). |
27 | ZHONG Y W, LIU J R, YANG L Y, et al. Maneuver library and integrated control system for autonomous close-in air combat[J]. Acta Aeronautica et Astronautica Sinica, 2008, 29(S1): 114-121 (in Chinese). |
Address: No. 238, Baiyan Building, Beisihuan Zhonglu Road, Haidian District, Beijing, China
Postal code: 100083
E-mail: hkxb@buaa.edu.cn
Total visits: 6658907 Today visits: 1341
All copyright © editorial office of Chinese Journal of Aeronautics