Autonomous evasive maneuver method for unmanned combat aerial vehicle in air combat with multiple tactical requirements
Received date: 2024-04-30
Revised date: 2024-05-31
Accepted date: 2024-06-28
Online published: 2024-07-12
Supported by
National Natural Science Foundation of China (62006193); Key Research and Development Program of Shaanxi Province (2024GX-YBXM-115); Aeronautical Science Foundation of China (2022Z023053001); Fundamental Research Funds for the Central Universities (D5000230150)
Keywords: air combat maneuvering; autonomous evasion; tactical requirements; UCAV; LSTM-Dueling DDQN
YANG Zhen, LI Lin, CHAI Shiyuan, HUANG Jichuan, PIAO Haiyin, ZHOU Deyun. Autonomous evasive maneuver method for unmanned combat aerial vehicle in air combat with multiple tactical requirements[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(20): 630629-630629. DOI: 10.7527/S1000-6893.2024.30629
Air combat is typically a continuous process involving multiple rounds of missile attack and defense. When evading an incoming air-to-air missile, an Unmanned Combat Aerial Vehicle (UCAV) should therefore consider the impact of its maneuvers on the entire air combat mission rather than focusing on safety alone. This paper proposes an autonomous evasive maneuver method for UCAVs under multiple tactical requirements, including miss distance, energy consumption, and terminal situational advantage. A three-dimensional UCAV-missile pursuit-evasion model is established, together with state-space, action-space, and reward-function models for UCAV autonomous evasion. For this model, an LSTM-Dueling DDQN (Long Short-Term Memory-Dueling Double Deep Q-Network) algorithm is proposed; it fuses the Double DQN and Dueling DQN network architectures and uses an LSTM network to extract temporal features. Based on the idea of exploratory curriculum learning, dense and sparse reward functions are fused over time so that human experience and policy exploration jointly guide the learning process. In addition, the Chebyshev method is introduced to solve for the Pareto set of strategies under different degrees of emphasis on the tactical requirements, reflecting the conflict and coupling among them. Simulation experiments and result analysis show that the proposed method achieves good convergence speed and learning performance, and is feasible and effective for solving the autonomous evasive maneuver problem in air combat under multiple tactical requirements. The resulting evasive maneuver strategies reflect the different tactical requirements while ensuring the UCAV's own safety.
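The abstract names three concrete mechanisms whose standard forms are worth spelling out, so minimal sketches of each follow. They are illustrative only: the paper's actual layer sizes, hyperparameters, reward definitions, and schedules are not reproduced here, and every dimension, name, and constant in the sketches (e.g. `hidden_dim`, `gamma`, the utopia point) is an assumption.

First, the LSTM-Dueling DDQN network: an LSTM encodes a window of recent states into temporal features, a dueling head splits the Q-value into state-value and advantage streams, and targets are computed Double-DQN style, with the online network selecting the next action and the target network evaluating it.

```python
# Illustrative PyTorch sketch of an LSTM-Dueling DDQN; all sizes are assumed.
import torch
import torch.nn as nn

class LSTMDuelingDDQN(nn.Module):
    def __init__(self, state_dim: int, n_actions: int, hidden_dim: int = 128):
        super().__init__()
        # LSTM extracts temporal features from the recent state sequence.
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True)
        # Dueling head: separate state-value and advantage streams.
        self.value = nn.Sequential(
            nn.Linear(hidden_dim, 64), nn.ReLU(), nn.Linear(64, 1))
        self.advantage = nn.Sequential(
            nn.Linear(hidden_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))

    def forward(self, state_seq: torch.Tensor) -> torch.Tensor:
        # state_seq: (batch, seq_len, state_dim); keep the last time step.
        features, _ = self.lstm(state_seq)
        h = features[:, -1, :]
        v = self.value(h)                    # (batch, 1)
        a = self.advantage(h)                # (batch, n_actions)
        # Dueling aggregation: Q = V + (A - mean(A)).
        return v + a - a.mean(dim=1, keepdim=True)

def double_dqn_target(online, target, next_seq, reward, done, gamma=0.99):
    # Double DQN: the online net selects the greedy next action and the
    # target net evaluates it, which reduces Q-value overestimation.
    with torch.no_grad():
        next_action = online(next_seq).argmax(dim=1, keepdim=True)
        next_q = target(next_seq).gather(1, next_action).squeeze(1)
        return reward + gamma * (1.0 - done) * next_q
```

Second, the temporal fusion of dense and sparse rewards. A common curriculum-style reading is a weight that shifts from a dense, hand-crafted shaping reward (human experience) toward the sparse terminal reward (outcome-driven exploration) as training progresses; the linear schedule below is an assumption, not the paper's.

```python
def fused_reward(r_dense: float, r_sparse: float,
                 episode: int, total_episodes: int) -> float:
    # Assumed linear curriculum: early episodes lean on the dense shaping
    # reward, later episodes on the sparse terminal reward.
    lam = min(1.0, episode / (0.8 * total_episodes))
    return (1.0 - lam) * r_dense + lam * r_sparse
```

Third, the Chebyshev method for the Pareto strategy set. Weighted Chebyshev scalarization collapses the three objectives (miss distance, energy consumption, terminal situational advantage) into one scalar by penalizing the largest weighted deviation from a utopia point; sweeping the weight vector traces out solutions with different tactical emphases. The objective values and utopia point below are hypothetical.

```python
import numpy as np

def chebyshev_scalarize(objectives: np.ndarray, weights: np.ndarray,
                        utopia: np.ndarray) -> float:
    # All objectives are phrased as "larger is better"; negate the largest
    # weighted deviation from the utopia point so the result is still a
    # "larger is better" scalar, usable as a reward signal.
    return -float(np.max(weights * (utopia - objectives)))

# Hypothetical episode outcome: miss distance [m], negated energy loss,
# terminal situational advantage in [0, 1].
objectives = np.array([120.0, -35.0, 0.6])
utopia = np.array([200.0, 0.0, 1.0])  # assumed ideal values

# Each weight vector emphasizes a different tactical requirement.
for w in ([0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6]):
    print(w, chebyshev_scalarize(objectives, np.array(w), utopia))
```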