Research on Trajectory Prediction Method of Incoming Missiles Based on Improved Inverse Reinforcement Learning in Aircraft Active Defense Mode

Hao ZHANG1, Jia-Ning LIU1, Zhi XU2, Yuan-Xin YANG1

  1. Northwestern Polytechnical University
  2. School of Astronautics, Northwestern Polytechnical University
  • Received: 2025-09-03 Revised: 2025-11-17 Online: 2025-11-20 Published: 2025-11-20
  • Contact: Zhi XU

Abstract: With advances in aircraft fire-control systems and situational awareness, defense against air-to-air missiles is shifting from passive measures such as jamming and deception to active defense, in which an interceptor missile is launched against the incoming missile. However, the interceptor has a low average velocity, a small defense space, and an overload ratio too low to meet the precise-collision requirement of traditional proportional navigation guidance, which places new demands on trajectory prediction for the incoming missile. This paper addresses the high-probability prediction of interceptor guidance information in a three-body active defense scenario consisting of the carrier aircraft, the incoming missile, and the interceptor missile, and proposes an incoming-missile trajectory prediction method based on inverse reinforcement learning. First, a mathematical model is constructed for extracting the temporal maneuvering features of the incoming missile under maximum causal entropy, and a library of behavioral policy functions for the incoming missile's guidance law is established within the inverse reinforcement learning framework. Then, a quadratic-form expression for computing the inverse reinforcement learning policy function is derived, which reduces the computational complexity of the policy function in high-dimensional state spaces. Finally, the weighting coefficients of the policy functions are computed online from rolling-window measurement data, so that the trajectory prediction distributions are selected and adaptively weighted in real time, yielding a real-time prediction model of the incoming missile's trajectory. Simulation results show that, in the three-body active defense scenario, the proposed trajectory prediction network algorithm generalizes well to cases outside the model set and sample set, adapts well to complex target maneuvers, and achieves high prediction accuracy. It provides a high-probability trajectory prediction model that can be used for guidance in defending against incoming missiles, and has theoretical significance and engineering reference value.

Key words: Three-body active defense scenario, Guided missiles, Active defense, Inverse reinforcement learning, Trajectory prediction

CLC number:
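
The final step described in the abstract, computing weighting coefficients online from rolling-window measurements and blending the predictions of a policy library, can be illustrated with a small sketch. This is not the paper's algorithm: the softmax-of-negative-error weighting, the `temperature` parameter, the class name `RollingPolicyMixer`, and the two toy 1-D "policies" are all assumptions chosen for illustration, and the maximum-causal-entropy policy functions themselves are not reproduced here.

```python
import math
from collections import deque


def window_weights(errors_by_policy, temperature=1.0):
    """Softmax over the negative mean squared window error of each policy.

    Assumption: this likelihood-style weighting stands in for the paper's
    weighting formula, which the abstract does not give.
    """
    scores = [-sum(e * e for e in errs) / (len(errs) * temperature)
              for errs in errors_by_policy]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


class RollingPolicyMixer:
    """Blend one-step predictions from a fixed library of candidate policies."""

    def __init__(self, policies, window=5, temperature=1.0):
        self.policies = policies  # callables: state -> predicted next state
        self.temperature = temperature
        # One bounded error history per policy; old samples drop out automatically.
        self.errors = [deque(maxlen=window) for _ in policies]

    def update(self, prev_state, measured_state):
        """Score every policy against the newest measurement."""
        for errs, policy in zip(self.errors, self.policies):
            errs.append(policy(prev_state) - measured_state)

    def weights(self):
        if not self.errors[0]:  # no data yet: fall back to uniform weights
            return [1.0 / len(self.policies)] * len(self.policies)
        return window_weights(self.errors, self.temperature)

    def predict(self, state):
        """Weighted one-step prediction using the current window weights."""
        w = self.weights()
        return sum(wi * p(state) for wi, p in zip(w, self.policies))


# Hypothetical 1-D "policies": each maps a position to the next position.
def slow(x):
    return x + 1.0


def fast(x):
    return x + 3.0


mixer = RollingPolicyMixer([slow, fast], window=5, temperature=0.1)
track = [0.0, 3.0, 6.0, 9.0]  # measurements generated by the "fast" rule
for prev, cur in zip(track, track[1:]):
    mixer.update(prev, cur)

w_slow, w_fast = mixer.weights()
prediction = mixer.predict(track[-1])
```

In this toy run the measurements follow the `fast` rule, so nearly all of the weight shifts to that policy within a few samples and the blended prediction follows it. A bounded `deque` keeps the window fixed in size, so weights can change again if the target switches maneuver mode.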