Electronics, Electrical Engineering and Control

Trajectory prediction method of incoming missiles based on improved inverse reinforcement learning in aircraft active defense mode

  • Hao ZHANG
  • Jianing LIU
  • Zhi XU
  • Yuanxin YANG
  • School of Astronautics, Northwestern Polytechnical University, Xi’an 710129, China
E-mail: xuzhi@nwpu.edu.cn

Received date: 2025-09-03

Revised date: 2025-09-18

Accepted date: 2025-11-08

Online published: 2025-11-20

Cite this article

ZHANG H, LIU J N, XU Z, YANG Y X. Trajectory prediction method of incoming missiles based on improved inverse reinforcement learning in aircraft active defense mode[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(8): 332753. DOI: 10.7527/S1000-6893.2025.32753

Abstract

With advances in aircraft fire control systems and situational awareness, defense against air-to-air missiles is shifting from passive methods such as jamming and deception to active defense, in which an interceptor missile engages the incoming missile. However, the interceptor has a low average velocity, a small defense space, and an overload ratio too low to support the precise collision required by traditional proportional navigation guidance, which poses new challenges for predicting the trajectory of the incoming missile. To address the high-probability prediction of guidance information for the interceptor in a three-body active defense scenario involving the carrier aircraft, the incoming missile, and the interceptor, this paper proposes an incoming missile trajectory prediction method based on inverse reinforcement learning. First, a mathematical model is constructed to extract the temporal maneuvering features of the incoming missile under maximum causal entropy, and a library of behavioral policy functions for the incoming missile's guidance law is established within the inverse reinforcement learning framework. Then, a quadratic-form formula for computing the inverse reinforcement learning policy function is derived, which reduces the computational complexity of the policy function in high-dimensional state spaces. Finally, the weighting coefficients of the policy functions are computed online from rolling-window measurement data, so that the trajectory prediction distributions are selected and adaptively weighted in real time, forming a real-time prediction model of the incoming missile trajectory. Simulation results show that, in the three-body active defense scenario, the proposed trajectory prediction network algorithm generalizes well outside the model set and sample set, adapts well to complex target maneuvers, and achieves high prediction accuracy. The method provides a high-probability trajectory prediction model usable for guidance in defending against incoming missiles, and has theoretical significance and engineering reference value.
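The rolling-window weighting step described in the abstract can be illustrated with a toy sketch. The abstract does not give the weighting formula, so everything below is an assumption made for illustration: each library policy is reduced to a hypothetical scalar linear feedback law u = gain * x, the windowed measurements are scored with a Gaussian log-likelihood, and the weights are a softmax over those scores. The names `policy_weights`, `weighted_command`, `track`, and `GAINS` are invented here and are not taken from the paper.

```python
import math
import random
from collections import deque


def window_log_likelihood(gain, window, sigma):
    """Gaussian log-likelihood of the windowed commands under u = gain * x."""
    norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((u - gain * x) / sigma) ** 2 - norm for x, u in window)


def policy_weights(gains, window, sigma):
    """Softmax over window log-likelihoods: one weight per library policy."""
    logs = [window_log_likelihood(g, window, sigma) for g in gains]
    peak = max(logs)  # subtract the maximum for numerical stability
    scores = [math.exp(v - peak) for v in logs]
    total = sum(scores)
    return [s / total for s in scores]


def weighted_command(gains, weights, x):
    """Weighted-mixture prediction of the command for state x."""
    return sum(w * g * x for g, w in zip(gains, weights))


def track(observations, gains, window_len=10, sigma=0.2):
    """Slide a fixed-length window over (x, u) pairs; return weights per step."""
    window = deque(maxlen=window_len)  # old measurements drop out automatically
    history = []
    for x, u in observations:
        window.append((x, u))
        history.append(policy_weights(gains, window, sigma))
    return history


# Hypothetical policy library (assumed gains) and synthetic measurements:
# the simulated target switches its true gain from 3 to 5 at step 30.
GAINS = [2.0, 3.0, 4.0, 5.0]
rng = random.Random(0)
observations = []
for step in range(60):
    x = rng.uniform(0.5, 1.5)
    true_gain = 3.0 if step < 30 else 5.0
    observations.append((x, true_gain * x + rng.gauss(0.0, 0.05)))

history = track(observations, GAINS)
best_before = max(range(len(GAINS)), key=lambda i: history[29][i])
best_after = max(range(len(GAINS)), key=lambda i: history[59][i])
print("dominant policy before switch:", GAINS[best_before])
print("dominant policy after switch:", GAINS[best_after])
```

Because the window has a fixed length, measurements taken before the switch leave the window after ten steps and the weights move to the policy that matches the new behavior. A longer window would smooth the weights against noise at the cost of slower adaptation.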
