Intelligent decision-making of airborne terminal infrared composite jamming based on DACM-PPO

doi:10.7527/S1000-6893.2025.32759

韩滟泷¹, 张安¹, 毕文豪¹, 范秋岑², 侯天乐¹

1. Northwestern Polytechnical University
2. School of Aeronautics, Northwestern Polytechnical University

Received: 2025-09-08 Revised: 2025-12-04 Online: 2025-12-08 Published: 2025-12-08
Corresponding author: 毕文豪 (Wen-Hao BI)
Funding:
Research on agile cooperative control of multi-mode UAV swarms for dynamic multi-task missions

Abstract

Abstract: As the guidance accuracy and maneuverability of infrared-guided air-to-air missiles continue to improve, combat aircraft can no longer reliably avoid being hit through evasive maneuvers or a single infrared countermeasure alone, and composite infrared jamming has become an important means of ensuring aircraft survivability. To address the airborne terminal infrared composite jamming problem, this paper proposes an intelligent decision-making method based on an improved proximal policy optimization (PPO) algorithm. Starting from the airborne terminal countermeasure scenario, the decision constraints of a combat aircraft under attack by an infrared-guided missile are analyzed, and models of infrared decoy flares and laser directional jamming are established. A dynamic asymmetric clipping mechanism is proposed to overcome the rigidity of fixed clipping parameters and to improve convergence efficiency and solution quality. On this basis, a reward function that incorporates the characteristics of the jamming means is designed, with penalty terms for excessive use and ineffective use, so as to balance jamming effectiveness against resource consumption. Simulation results show that the proposed method can organize the infrared jamming means in a reasonably coordinated manner and performs well across a variety of typical aircraft-missile engagement situations. Compared with the original PPO algorithm, the soft actor-critic (SAC) algorithm, and a preset rule-based method, it achieves significant advantages in aircraft survival rate, missile miss distance, and resource utilization efficiency, indicating good application value.

Key words: airborne terminal defense, infrared composite jamming, deep reinforcement learning, infrared decoy flare, laser directional jamming
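
The abstract names two mechanisms without giving their formulas: a clipping range that is asymmetric and changes during training, and a reward that subtracts penalties for excessive and ineffective use of jamming resources. The Python sketch below illustrates the general idea only. The linear schedule, the default clip widths, the penalty weights, and every function and parameter name (`clip_bounds`, `clipped_surrogate`, `step_reward`, `flare_budget`, and so on) are assumptions made for illustration, not the formulation used in the paper.

```python
def clip_bounds(progress, start=(0.2, 0.3), end=(0.1, 0.15)):
    """Return (eps_low, eps_high) for a training progress value in [0, 1].

    Assumption: both widths shrink linearly as training advances, and the
    upper width differs from the lower one, which makes the clip asymmetric.
    """
    p = min(max(progress, 0.0), 1.0)
    eps_low = start[0] + p * (end[0] - start[0])
    eps_high = start[1] + p * (end[1] - start[1])
    return eps_low, eps_high


def clipped_surrogate(ratio, advantage, eps_low, eps_high):
    """Per-sample PPO clipped surrogate with separate lower and upper bounds.

    ratio is pi_new(a|s) / pi_old(a|s). Setting eps_low == eps_high recovers
    the standard symmetric PPO clipping objective.
    """
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped * advantage)


def step_reward(effect_gain, flares_used, laser_on, threat_active,
                flare_budget=2, w_over=0.5, w_invalid=0.3):
    """Jamming effectiveness minus overuse and ineffective-use penalties.

    Assumption: overuse means releasing more flares than flare_budget in one
    step, and ineffective use means jamming while no threat is active.
    """
    overuse = max(0, flares_used - flare_budget)
    used_any = flares_used > 0 or laser_on
    invalid = 1.0 if (used_any and not threat_active) else 0.0
    return effect_gain - w_over * overuse - w_invalid * invalid


if __name__ == "__main__":
    eps_low, eps_high = clip_bounds(0.5)
    print(eps_low, eps_high)
    print(clipped_surrogate(1.5, 1.0, eps_low, eps_high))
    print(step_reward(1.0, 3, False, True))
```

In this sketch a wider upper bound lets the policy move further toward actions with positive advantage than away from them, and shrinking both bounds over time makes late updates more conservative. Whether the paper adapts the bounds by training progress or by some other signal is not stated in the abstract.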

CLC number: V279

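Cite this article: 韩滟泷, 张安, 毕文豪, 范秋岑, 侯天乐. Intelligent decision-making of airborne terminal infrared composite jamming based on DACM-PPO[J]. Acta Aeronautica et Astronautica Sinica, doi: 10.7527/S1000-6893.2025.32759.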


