航空学报 > 2026, Vol. 47 Issue (6): 332633-332633   doi: 10.7527/S1000-6893.2025.32633

基于时空信息融合的多机协同空战决策方法

廉云霄1, 李霓1,2, 谢锋1,3, 周攀1,4, 董长印1,2

  1. 西北工业大学 航空学院,西安 710072
  2. 飞行器基础布局全国重点实验室,西安 710072
  3. 航空工业成都飞机设计研究所,成都 610041
  4. 中国空气动力研究与发展中心 空天技术研究所,绵阳 621000
  • 收稿日期:2025-07-28 修回日期:2025-08-29 接受日期:2025-10-09 出版日期:2025-10-20 发布日期:2025-10-17
  • 通讯作者: 李霓 E-mail:lini@nwpu.edu.cn
  • 基金资助:
    国家自然科学基金(52372398);国家自然科学基金(62003272);国家自然科学基金(52302405)

A multi-UAV cooperative air combat decision-making method based on spatial-temporal information fusion

Yunxiao LIAN1, Ni LI1,2, Feng XIE1,3, Pan ZHOU1,4, Changyin DONG1,2

  1. School of Aeronautics, Northwestern Polytechnical University, Xi’an 710072, China
  2. National Key Laboratory of Aircraft Configuration Design, Xi’an 710072, China
  3. AVIC Chengdu Aircraft Design and Research Institute, Chengdu 610041, China
  4. Institute of Space Technology, China Aerodynamics Research and Development Center, Mianyang 621000, China
  • Received: 2025-07-28  Revised: 2025-08-29  Accepted: 2025-10-09  Online: 2025-10-20  Published: 2025-10-17
  • Contact: Ni LI, E-mail: lini@nwpu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (52372398, 62003272, 52302405)

摘要:

多智能体强化学习是当前实现多机自主协同空战最具潜力的方法之一。然而现有方法受限于端到端网络结构,在空战中存在多机协同性差和难以反映决策动机的关键性问题。为此,提出一种时空信息融合的多机协同空战决策方法以提升多机空战的协同性与可解释性。首先,设计了一种基于图注意力机制的空间信息融合方法聚合智能体局部观测并形成全局态势评估,增强了全连接评价网络信息融合效率和训练效率。其次,设计了一种交叉注意力和门控循环单元的时空信息融合方法聚合敌友方单元空间信息和时序信息,为策略网络融合协同性特征。最后,结合强化学习构建了时空信息融合的多机协同空战决策算法,并在高保真空战环境下进行了验证。实验结果表明:所提方法具有较强的协同性和决策动机的可解释性。

关键词: 多机协同空战, 多智能体强化学习, 时空信息融合, 图注意力, 交叉注意力

Abstract:

Multi-agent reinforcement learning is currently one of the most promising approaches to autonomous cooperative air combat among multiple aircraft. However, existing methods are constrained by end-to-end network architectures and suffer from two critical problems in air combat: poor multi-UAV coordination and difficulty in revealing the motivation behind decisions. To address these problems, this paper proposes a multi-UAV cooperative air combat decision-making method based on spatial-temporal information fusion that improves both the coordination and the interpretability of multi-aircraft air combat. First, a spatial information fusion method based on a graph attention mechanism is designed to aggregate the local observations of the agents into a global situation assessment, which improves the information fusion efficiency and training efficiency of the fully connected critic network. Second, a spatial-temporal information fusion method combining cross-attention and a gated recurrent unit (GRU) is developed to aggregate the spatial and temporal information of enemy and friendly units, providing the policy network with fused coordination features. Finally, a multi-UAV cooperative air combat decision-making algorithm based on spatial-temporal information fusion is constructed by integrating these modules with reinforcement learning, and it is validated in a high-fidelity air combat environment. Experimental results show that the proposed method achieves strong coordination and makes its decision-making motivation interpretable.

Key words: multi-UAV cooperative air combat, multi-agent reinforcement learning, spatial-temporal information fusion, graph attention, cross-attention
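
The fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scaled dot-product scoring as a stand-in for both the graph attention and the cross-attention scoring functions, omits the learned query/key/value projections and all biases, and uses random weights. Every name and number in it (`attend`, `gru_step`, `visible`, the feature vectors) is hypothetical and chosen only for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(W, x):
    return [dot(row, x) for row in W]


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attend(query, keys, values, mask=None):
    """Spatial fusion: one query attends over several units.

    mask[j] == False excludes unit j, e.g. a missing graph edge
    or a unit outside sensor range (assumption for this sketch).
    """
    scale = math.sqrt(len(query))
    scores = [dot(query, k) / scale for k in keys]
    if mask is not None:
        scores = [s if keep else -1e9 for s, keep in zip(scores, mask)]
    weights = softmax(scores)
    dim = len(values[0])
    fused = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return fused, weights


def gru_step(x, h, p):
    """Temporal fusion: one GRU update without bias terms."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b) for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], rh))]
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


random.seed(0)
D = 4  # hypothetical feature size


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


params = {name: rand_matrix(D, D) for name in ("Wz", "Uz", "Wr", "Ur", "Wn", "Un")}

own = [0.2, -0.1, 0.4, 0.0]        # the deciding agent's own features
units = [[0.1, 0.0, 0.3, -0.2],    # one friendly unit
         [-0.3, 0.2, 0.1, 0.5],    # enemy unit 1
         [0.4, 0.4, -0.1, 0.0]]    # enemy unit 2
visible = [True, True, False]      # enemy unit 2 is not observed

h = [0.0] * D                      # recurrent state carried across time steps
for t in range(3):
    # Toy observation drift so that each time step differs slightly.
    frame = [[v + 0.05 * t for v in u] for u in units]
    fused, weights = attend(own, frame, frame, mask=visible)
    h = gru_step(fused, h, params)

print("attention weights at last step:", weights)
print("fused spatial-temporal feature:", h)
```

In a sketch like this, the attention weights give a per-unit score that can be inspected directly, which is one plausible way such a design could expose which friendly or enemy unit influenced a decision.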

中图分类号: