ACTA AERONAUTICA ET ASTRONAUTICA SINICA
Intelligent decision-making in air combat maneuvering based on heuristic reinforcement learning
Received date: 2017-02-06
Revised date: 2017-04-28
Online published: 2017-04-28
Intelligent decision-making for air combat maneuvering has long been an active research topic. Current research mainly applies optimization theory and traditional artificial intelligence algorithms to compute air combat decision sequences in a relatively fixed environment. However, air combat is a dynamic process that contains many uncertain elements, so it is difficult for these traditional methods to produce decision sequences that match actual air combat conditions. This paper proposes a new method for intelligent decision-making in air combat maneuvering based on heuristic reinforcement learning. A "trial-and-error learning" approach is adopted to compute a relatively good decision sequence in dynamic air combat, while a neural network simultaneously learns from the reinforcement learning process to accumulate knowledge and heuristically guide the search. This greatly increases search efficiency and enables real-time dynamic computation of the decision sequence during air combat. Experimental results indicate that the resulting decision sequences conform to actual conditions.
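The idea of trial-and-error learning guided by accumulated knowledge can be illustrated with a minimal sketch in the style of heuristically accelerated Q-learning, where actions are chosen by maximizing Q(s, a) + ξH(s, a). This is not the paper's implementation: the one-dimensional corridor environment, all parameter values, and the function names are hypothetical, and a simple lookup table `H` stands in for the neural network described in the abstract.

```python
import random

N_STATES = 6            # hypothetical corridor: start at state 0, goal at state 5
ACTIONS = (-1, +1)      # index 0 = move left, index 1 = move right
ALPHA, GAMMA, EPSILON, XI = 0.5, 0.9, 0.2, 1.0   # assumed values, not from the paper


def step(state, action_idx):
    """Deterministic toy dynamics; reward 1 only on reaching the goal."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def choose(Q, H, state, rng, explore=True):
    """Epsilon-greedy over Q + XI * H, breaking ties at random."""
    if explore and rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    scores = [Q[state][a] + XI * H[state][a] for a in range(len(ACTIONS))]
    best = max(scores)
    return rng.choice([a for a, s in enumerate(scores) if s == best])


def train(episodes=200, max_steps=200, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]   # learned action values
    H = [[0.0, 0.0] for _ in range(N_STATES)]   # heuristic (stand-in for the network)
    for _ in range(episodes):
        state, last_action = 0, {}
        for _ in range(max_steps):
            a = choose(Q, H, state, rng)
            nxt, reward, done = step(state, a)
            # Standard Q-learning update (trial and error).
            target = reward if done else reward + GAMMA * max(Q[nxt])
            Q[state][a] += ALPHA * (target - Q[state][a])
            last_action[state] = a
            state = nxt
            if done:
                # Accumulate knowledge from the successful episode: remember
                # the last action taken in each visited state.
                for s, act in last_action.items():
                    H[s] = [0.0, 0.0]
                    H[s][act] = 1.0
                break
    return Q, H


def greedy_rollout(Q, H, max_steps=50):
    """Follow the heuristic-guided greedy policy from the start state."""
    rng = random.Random(0)
    state, path = 0, [0]
    for _ in range(max_steps):
        a = choose(Q, H, state, rng, explore=False)
        state, _, done = step(state, a)
        path.append(state)
        if done:
            break
    return path
```

Calling `Q, H = train()` and then `greedy_rollout(Q, H)` traces the state sequence followed after learning. The heuristic only biases action selection; the Q-values are still learned by ordinary trial and error, which is the division of labor the abstract describes.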
ZUO Jialiang, YANG Rennong, ZHANG Ying, LI Zhonglin, WU Meng. Intelligent decision-making in air combat maneuvering based on heuristic reinforcement learning[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2017, 38(10): 321168-321168. DOI: 10.7527/S1000-6893.2017.321168