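Trajectory prediction method of incoming missiles based on improved inverse reinforcement learning in aircraft active defense mode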

doi:10.7527/S1000-6893.2025.32753

Abstract

With advances in aircraft fire control systems and situational awareness, defense against air-to-air missiles is shifting from passive methods such as jamming and deception to active defense, in which interceptor missiles engage incoming threats. However, the low average velocity, limited defense space, and insufficient overload ratio of interceptor missiles make it difficult for traditional proportional navigation guidance to meet precise collision requirements, which poses new challenges for predicting the trajectory of incoming missiles. To predict, with high probability, the guidance information needed by interceptor missiles in a three-body active defense scenario involving the carrier aircraft, the incoming missile, and the interceptor missile, this paper proposes an incoming missile trajectory prediction method based on inverse reinforcement learning. First, a mathematical model is constructed to extract the temporal maneuvering characteristics of incoming missiles under the principle of maximum causal entropy, and a behavioral strategy library for the incoming missile's guidance law is established within the inverse reinforcement learning framework. Then, a quadratic-form expression for the inverse reinforcement learning strategy function is derived, which reduces the computational complexity of the strategy function in high-dimensional state spaces. Finally, the weighting coefficients of the strategy functions are computed online from rolling-window measurement data, enabling real-time optimization and an adaptively weighted distribution of predicted trajectories, which together form a real-time prediction model for incoming missile trajectories. Simulation results show that, in the three-body active defense context, the proposed trajectory prediction network algorithm generalizes well to "out-of-model-set/sample-set" scenarios, adapts dynamically to complex target maneuvers, and achieves high prediction accuracy. The method provides a high-probability trajectory prediction model suitable for defensive guidance and therefore has both theoretical significance and engineering application value.
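
The final step of the abstract (online weighting of a strategy library from rolling-window measurements) can be illustrated with a minimal sketch. The abstract does not give the paper's weighting formula, so the code below assumes a generic choice: each candidate strategy is scored by the Gaussian log-likelihood of its residuals over the most recent window, the scores are normalized with a softmax, and the prediction is the weighted sum. The names `window_weights`, `weighted_prediction`, the toy strategies, and the values of `WINDOW` and `SIGMA` are all hypothetical.

```python
import math
from collections import deque

def window_weights(residual_windows, sigma):
    """Softmax weights from Gaussian log-likelihoods of per-strategy residuals.

    Assumed weighting rule for illustration; not taken from the paper.
    """
    log_liks = [-sum(r * r for r in res) / (2.0 * sigma ** 2)
                for res in residual_windows]
    m = max(log_liks)  # subtract the maximum for numerical stability
    unnorm = [math.exp(l - m) for l in log_liks]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def weighted_prediction(predictions, weights):
    """Weighted sum of the candidate strategies' predictions."""
    return sum(w * p for w, p in zip(weights, predictions))

# Toy 1-D example: two hypothetical strategies, one of which matches the truth.
WINDOW = 5    # rolling-window length in samples (assumed)
SIGMA = 1.0   # measurement noise standard deviation (assumed)
strategies = {
    "slow": lambda t: 1.0 * t,
    "fast": lambda t: 2.0 * t,
}
truth = lambda t: 2.0 * t

# Each deque keeps only the last WINDOW residuals for one strategy.
windows = {name: deque(maxlen=WINDOW) for name in strategies}
for t in range(10):
    z = truth(t)  # noiseless measurement in this toy case
    for name, f in strategies.items():
        windows[name].append(z - f(t))

names = list(strategies)
weights = window_weights([windows[n] for n in names], SIGMA)
pred = weighted_prediction([strategies[n](10) for n in names], weights)
print(dict(zip(names, weights)), pred)
```

Because the window has a fixed length, old residuals drop out automatically, so the weights can shift when the observed behavior changes partway through a flight.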

Key words: three-body active defense scenario, guided missiles, active defense, inverse reinforcement learning, trajectory prediction

CLC Number: V249.1

Hao ZHANG, Jianing LIU, Zhi XU, Yuanxin YANG. Trajectory prediction method of incoming missiles based on improved inverse reinforcement learning in aircraft active defense mode[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(8): 332753.

Table 5

Designed working conditions for algorithm assessment

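Condition	Key parameters and trigger logic	Remarks
Condition 1	Simulation conditions set to the typical three-body active defense scenario (see the text accompanying Table 2)	Other settings as in Condition 1
Condition 2	The carrier aircraft performs extreme S-shaped evasive maneuvers at 10g/12.5g/15g
Condition 3	The incoming missile's guidance strategy switches in flight; the switching condition is $r_{AT} > R_{AT}$, $R_{AT} \sim U(0.5\,r_{AT0},\ 0.8\,r_{AT0})$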

Table A1

Assumed incoming missile guidance law type

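Guidance law type $j$	Reference velocity vector $\boldsymbol{L}_j$
TPN	$\boldsymbol{L}_{TPN} = \Delta\dot{r}_0\,\boldsymbol{e}_r$
RTPN	$\boldsymbol{L}_{RTPN} = \Delta\dot{r}\,\boldsymbol{e}_r$
GTPN	$\boldsymbol{L}_{GTPN} = \Delta\dot{r}_0\,\boldsymbol{e}_r + \Delta r_0\,\omega_{s0}\,\boldsymbol{e}_\theta$
IPN	$\boldsymbol{L}_{IPN} = \Delta\dot{r}\,\boldsymbol{e}_r + \Delta r\,\omega_s\,\boldsymbol{e}_\theta$
PPN	$\boldsymbol{L}_{PPN} = -\boldsymbol{v}_T$
OPN	$\boldsymbol{L}_{OPN} = u_{opt}(\theta)\,\boldsymbol{e}_r + v_{opt}(\theta)\,\boldsymbol{e}_\theta$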
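
As a rough illustration of how the reference velocity vectors in Table A1 could be evaluated, the sketch below works in a planar engagement. Several points are assumptions rather than definitions from the paper: $\Delta r$ is taken as the relative range, $\Delta\dot{r}$ as the range rate, $\omega_s$ as the line-of-sight rate, the subscript 0 as the value at the initial instant, $\boldsymbol{e}_\theta$ as $\boldsymbol{e}_r$ rotated by 90° counterclockwise, and $\boldsymbol{v}_T$ as a velocity vector supplied by the caller. OPN is omitted because $u_{opt}$ and $v_{opt}$ are not defined in this excerpt. The function names are hypothetical.

```python
import math

def los_quantities(rel_pos, rel_vel):
    """Range, range rate, line-of-sight rate and unit vectors in the plane."""
    x, y = rel_pos
    vx, vy = rel_vel
    r = math.hypot(x, y)
    e_r = (x / r, y / r)                    # unit vector along the line of sight
    e_theta = (-e_r[1], e_r[0])             # e_r rotated 90 deg counterclockwise (assumed convention)
    r_dot = vx * e_r[0] + vy * e_r[1]       # range rate
    omega = (x * vy - y * vx) / (r * r)     # line-of-sight angular rate
    return r, r_dot, omega, e_r, e_theta

def _scale(a, v):
    return (a * v[0], a * v[1])

def _add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def reference_vectors(rel_pos, rel_vel, r0, r_dot0, omega0, v_T):
    """Reference velocity vectors L_j for five of the six laws in Table A1."""
    r, r_dot, omega, e_r, e_theta = los_quantities(rel_pos, rel_vel)
    return {
        "TPN": _scale(r_dot0, e_r),
        "RTPN": _scale(r_dot, e_r),
        "GTPN": _add(_scale(r_dot0, e_r), _scale(r0 * omega0, e_theta)),
        "IPN": _add(_scale(r_dot, e_r), _scale(r * omega, e_theta)),
        "PPN": (-v_T[0], -v_T[1]),
    }

# Hypothetical geometry: relative position in m, relative velocity in m/s.
rel_pos = (3000.0, 4000.0)
rel_vel = (-300.0, 50.0)
r0, r_dot0, omega0, _, _ = los_quantities((6000.0, 8000.0), (-320.0, 40.0))
vecs = reference_vectors(rel_pos, rel_vel, r0, r_dot0, omega0, v_T=(600.0, 0.0))
for name, vec in vecs.items():
    print(name, vec)
```

Under these assumptions the IPN vector $\Delta\dot{r}\,\boldsymbol{e}_r + \Delta r\,\omega_s\,\boldsymbol{e}_\theta$ is the polar decomposition of the relative velocity, so it reproduces `rel_vel`, which gives a quick sanity check on the geometry.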

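Maneuver type	Description
Constant speed and altitude	Uniform straight-line flight
S-shaped maneuver	S-shaped maneuver with constant overload magnitude
Turn-away (tail-on)	Descending U-turn maneuver with constant overload magnitude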

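Initial state	Value
Missile-target range/km	200
Altitude/km	10
Initial speed at end of boost phase	Ma=8
Position dispersion in downrange, altitude, and lateral directions/km	±20, ±2, ±20
Dispersion of flight path angle and heading angle/(°)	±3, ±20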

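Observation	Noise level (1σ standard deviation)
Relative downrange/m	10
Relative altitude/m	8
Relative cross-range/m	8
Downrange relative velocity/(m·s^-1)	1.0
Altitude relative velocity/(m·s^-1)	0.5
Cross-range relative velocity/(m·s^-1)	0.5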
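
The noise levels above can be applied to a simulated relative state as in the following minimal sketch. It assumes zero-mean, mutually independent Gaussian noise on each channel, which the table does not state explicitly. The dictionary keys, the function name, and the example state are hypothetical.

```python
import math
import random

# 1-sigma noise levels from the table above (positions in m, velocities in m/s).
NOISE_SIGMA = {
    "downrange_m": 10.0,
    "altitude_m": 8.0,
    "crossrange_m": 8.0,
    "downrange_rate_mps": 1.0,
    "altitude_rate_mps": 0.5,
    "crossrange_rate_mps": 0.5,
}

def add_measurement_noise(true_state, rng):
    """Return a noisy copy of the relative state (independent Gaussian noise assumed)."""
    return {key: value + rng.gauss(0.0, NOISE_SIGMA[key])
            for key, value in true_state.items()}

# Hypothetical true relative state used only to exercise the function.
truth = {
    "downrange_m": 50000.0,
    "altitude_m": 1000.0,
    "crossrange_m": -2000.0,
    "downrange_rate_mps": -900.0,
    "altitude_rate_mps": 5.0,
    "crossrange_rate_mps": 20.0,
}

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [add_measurement_noise(truth, rng)["downrange_m"] for _ in range(20000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"sample mean {mean:.2f} m, sample std {std:.2f} m")
```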

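Carrier aircraft maneuver	Prediction algorithm	Within 10 s/m	Within 40 s/m
Uniform straight-line flight	IMM filtering identification algorithm	274.2131	550.465
	AIRL method	218.457	439.824
	LSTM-Transformer model	211.052	424.968
	Proposed algorithm	162.065	394.599
S-shaped maneuver	IMM filtering identification algorithm	258.143	677.651
	AIRL method	202.556	561.726
	LSTM-Transformer model	183.210	507.288
	Proposed algorithm	157.624	494.523
Turn-away (tail-on)	IMM filtering identification algorithm	234.227	501.584
	AIRL method	213.385	452.283
	LSTM-Transformer model	208.617	431.499
	Proposed algorithm	170.706	393.720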
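
To make the comparison easier to read, the relative error reduction of the proposed algorithm against each baseline can be computed directly from the tabulated values. The short script below does this arithmetic. It reads the source entry "274.213 1" as 274.2131 (digit grouping), which is an assumption, and the dictionary layout and function name are hypothetical.

```python
# Prediction errors in metres, copied from the table: (within 10 s, within 40 s).
errors = {
    "uniform straight-line flight": {
        "IMM": (274.2131, 550.465),   # first value assumes "274.213 1" means 274.2131
        "AIRL": (218.457, 439.824),
        "LSTM-Transformer": (211.052, 424.968),
        "Proposed": (162.065, 394.599),
    },
    "S-shaped maneuver": {
        "IMM": (258.143, 677.651),
        "AIRL": (202.556, 561.726),
        "LSTM-Transformer": (183.210, 507.288),
        "Proposed": (157.624, 494.523),
    },
    "turn-away": {
        "IMM": (234.227, 501.584),
        "AIRL": (213.385, 452.283),
        "LSTM-Transformer": (208.617, 431.499),
        "Proposed": (170.706, 393.720),
    },
}

def reduction_pct(baseline, proposed):
    """Percentage by which the proposed error is lower than the baseline error."""
    return 100.0 * (baseline - proposed) / baseline

reductions = {}
for maneuver, table in errors.items():
    proposed = table["Proposed"]
    reductions[maneuver] = {
        name: tuple(reduction_pct(b, p) for b, p in zip(vals, proposed))
        for name, vals in table.items() if name != "Proposed"
    }

for maneuver, per_baseline in reductions.items():
    for name, (r10, r40) in per_baseline.items():
        print(f"{maneuver:30s} vs {name:17s} 10 s: {r10:5.1f}%  40 s: {r40:5.1f}%")
```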

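Model	Network module	Input	Output	Hidden-layer configuration	Special structure/mechanism
AIRL	Generator	State variables of Eq. (4) (relative position and velocity vectors)	Mean and variance of each element of the incoming missile control vector (Eq. (5))	3 fully connected hidden layers, 128 neurons each
AIRL	Discriminator	Concatenation of the state (relative position and velocity vectors) and the control vector	Scalar discriminator value (distinguishes generator trajectories from expert trajectories), which also provides the reward signal for the generator	3 fully connected hidden layers, 128 neurons each
LSTM-Transformer	Encoder	Historical sequence of 50 frames containing incoming missile position and velocity	Context feature vector of the input sequence	2 LSTM layers, 128 neurons each
	Decoder	Context feature vector output by the encoder	Intermediate prediction features for the incoming missile state at the next time step	2 standard Transformer decoder layers, each with an 8-head self-attention sublayer (dimension 16 per head); other hidden dimensions 128	Causally masked self-attention (preserves the autoregressive property)
	Output layer	Intermediate prediction features output by the decoder	Three-axis position and velocity of the incoming missile at the next time step (6 output dimensions)	1 fully connected layer, output size 6	Rolling multi-step prediction (the previous network output is fed back as input)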
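
The rolling multi-step prediction noted in the last row can be sketched independently of the network itself. In the example below, a 50-frame history window is advanced one step at a time, and each predicted 6-dimensional state is appended to the window before the next call. The one-step predictor is a constant-velocity stand-in, not the LSTM-Transformer; the frame interval `DT` and the function names are likewise assumptions.

```python
from collections import deque

HISTORY_LEN = 50   # frames of history fed to the model, as in the table
DT = 0.1           # frame interval in seconds (assumed; not given in the table)

def constant_velocity_step(history, dt=DT):
    """Stand-in one-step predictor: extrapolate the latest frame at constant velocity."""
    x, y, z, vx, vy, vz = history[-1]
    return (x + vx * dt, y + vy * dt, z + vz * dt, vx, vy, vz)

def rolling_forecast(history, one_step, n_steps):
    """Autoregressive rollout: each prediction is fed back as the newest frame."""
    window = deque(history, maxlen=HISTORY_LEN)  # the oldest frame drops out automatically
    forecast = []
    for _ in range(n_steps):
        nxt = one_step(window)
        forecast.append(nxt)
        window.append(nxt)
    return forecast

# Hypothetical history: motion along x at 100 m/s, altitude 10 km.
history = [(100.0 * DT * k, 0.0, 10000.0, 100.0, 0.0, 0.0) for k in range(HISTORY_LEN)]
forecast = rolling_forecast(history, constant_velocity_step, n_steps=10)
print(forecast[-1])
```

Any callable that maps a window of frames to the next 6-dimensional state can replace `constant_velocity_step`, so a trained network could be substituted without changing the rollout loop. Since each step consumes earlier predictions, errors can accumulate over long horizons.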
