Electronics, Electrical Engineering and Control


Supported by

National Natural Science Foundation of China (52372398, 62003272, 52302405)

A multi-UAV cooperative air combat decision-making method based on spatial-temporal information fusion

  • Yunxiao LIAN ,
  • Ni LI ,
  • Feng XIE ,
  • Pan ZHOU ,
  • Changyin DONG
  • 1. School of Aeronautics, Northwestern Polytechnical University, Xi’an 710072, China
    2. National Key Laboratory of Aircraft Configuration Design, Xi’an 710072, China
    3. AVIC Chengdu Aircraft Design and Research Institute, Chengdu 610041, China
    4. Institute of Space Technology, China Aerodynamics Research and Development Center, Mianyang 621000, China
E-mail: lini@nwpu.edu.cn

Received date: 2025-07-28

  Revised date: 2025-08-29

  Accepted date: 2025-10-09

  Online published: 2025-10-17



Cite this article

LIAN Y X, LI N, XIE F, ZHOU P, DONG C Y. A multi-UAV cooperative air combat decision-making method based on spatial-temporal information fusion[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(6): 332633. DOI: 10.7527/S1000-6893.2025.32633

Abstract

Multi-agent reinforcement learning is currently one of the most promising approaches to autonomous cooperative air combat among multiple aircraft. However, existing methods are constrained by end-to-end network architectures and face critical issues in air combat, such as poor multi-UAV coordination and difficulty in reflecting decision-making motivation. To address these issues, this paper proposes a multi-UAV cooperative air combat decision-making method based on spatial-temporal information fusion to improve the cooperation and interpretability of multi-aircraft air combat. First, a spatial information fusion method based on a graph attention mechanism is designed to aggregate the local observations of agents into a global situation assessment, which improves the information fusion efficiency and training efficiency of the fully connected evaluation network. Second, a spatial-temporal information fusion method combining cross-attention and a gated recurrent unit is developed to aggregate the spatial and temporal information of enemy and friendly units, fusing coordination features for the policy network. Finally, a spatial-temporal information fusion-based multi-UAV cooperative air combat decision-making algorithm is constructed by integrating reinforcement learning, and it is validated in a high-fidelity air combat environment. Experimental results show that the proposed method exhibits strong coordination and interpretability of decision-making motivation.
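The spatial information fusion step described in the abstract, where a graph attention mechanism aggregates agents' local observations into a shared situation feature, can be illustrated with a minimal single-head GAT-style aggregation. This is a hedged sketch in NumPy, not the authors' implementation; the function name `graph_attention_aggregate` and the parameters `W`, `a`, and `adj` are hypothetical stand-ins for the learned projection, attention vector, and visibility graph.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_attention_aggregate(obs, W, a, adj):
    """Single-head graph-attention aggregation (GAT-style sketch).

    obs : (N, d)  local observation vector of each of N agents
    W   : (d, h)  shared linear projection (learned in practice)
    a   : (2h,)   attention parameter vector
    adj : (N, N)  1 where unit j is visible to agent i, else 0
    Returns an (N, h) array of fused features, one row per agent.
    """
    h = obs @ W                                    # project observations
    N = h.shape[0]
    # Pairwise attention logits e_ij = LeakyReLU(a^T [h_i || h_j])
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            z = a @ np.concatenate([h[i], h[j]])
            e[i, j] = z if z > 0 else 0.2 * z      # LeakyReLU, slope 0.2
    e = np.where(adj > 0, e, -1e9)                 # mask invisible units
    alpha = softmax(e, axis=1)                     # attention weights per agent
    return np.tanh(alpha @ h)                      # weighted feature aggregation
```

Masking the logits with a large negative value before the softmax restricts each agent's attention to units it can actually observe, which is what lets a centralized evaluation network build a global assessment from purely local observations.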
