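Autonomous decision-making method for integrated reconnaissance and strike operations under local observation and limited communication (Special Issue on Frontier Technologies for Intelligent High-Speed Flight Vehicles)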

ZHANG Dong; FU Jin-Bo; WANG Meng-Yang; SHEN Tong

doi:10.7527/S1000-6893.2026.33373

Acta Aeronautica et Astronautica Sinica



1. School of Astronautics, Northwestern Polytechnical University
2. Northwestern Polytechnical University
3. Xidian University

Received: 2026-01-15

Revised: 2026-05-07

Published online: 2026-05-08

Funding

Natural Science Foundation; 2025 Open Project Fund of the National Key Laboratory of Air-based Information Perception and Fusion



Abstract

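To address the discontinuous cooperative decision-making and dimensional mismatch that arise in UAV swarms when communication networks fragment dynamically and battlefield entities vary over time in highly adversarial environments, this paper proposes a heterogeneous graph spatio-temporal reasoning (HG-STR) decision-making method. First, a local dynamic heterogeneous graph centered on each UAV is constructed; a meta-relation-driven heterogeneous graph Transformer extracts semantic-topological features among UAVs, dynamic targets, and search areas, and gated recurrent units build a temporal memory that compensates for the decision oscillation caused by interruptions in local observation. Second, a learnable attention-based communication mechanism is introduced to adaptively filter key cooperative information and aggregate it with high confidence when physical links are limited and the network topology is frequently partitioned. Finally, a hierarchical architecture of "upper-level tactical game, lower-level command execution" is established, and a pointer-based multi-head policy network is designed to solve, within a unified framework, the joint decision problem of variable-length object assignment and quantitative resource allocation. A typical multi-region reconnaissance-strike scenario is constructed. Simulation results show that the task completion rate improves by 37.14% over a traditional rule-based algorithm; compared with a global optimization algorithm, the single-step decision time falls from the order of seconds to the order of milliseconds; and a task success rate of 94% is maintained under weakly connected conditions with an extremely limited communication radius.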
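The two mechanisms named in the abstract, attention-based communication under a limited radius and pointer-style selection over a variable number of candidates, can be illustrated with a minimal sketch. This is not the authors' HG-STR implementation: the function names (`aggregate_messages`, `pointer_select`), the plain dot-product scoring, and the fixed vectors standing in for learned embeddings are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_messages(query, self_pos, neighbors, comm_radius, msg_dim):
    """Attention-weighted sum of messages from neighbors within comm_radius.

    neighbors: list of (position, key, message) tuples (hypothetical layout).
    Returns (aggregate, weights). If no neighbor is reachable, the aggregate
    is a zero vector, so an isolated agent must rely on its own memory.
    """
    reachable = [n for n in neighbors
                 if math.dist(self_pos, n[0]) <= comm_radius]
    if not reachable:
        return [0.0] * msg_dim, []
    scale = math.sqrt(len(query))  # scaled dot-product attention
    weights = softmax([dot(query, key) / scale for _, key, _ in reachable])
    aggregate = [
        sum(w * msg[i] for w, (_, _, msg) in zip(weights, reachable))
        for i in range(msg_dim)
    ]
    return aggregate, weights


def pointer_select(query, candidates):
    """Pointer-style choice over a variable-length candidate list.

    Scores every candidate embedding against the query, so the size of the
    output distribution follows the number of candidates currently visible.
    Returns (index_of_most_probable_candidate, probabilities).
    """
    probs = softmax([dot(query, c) for c in candidates])
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs
```

Because the output distribution of `pointer_select` has one entry per candidate, the same function handles three visible targets or thirty without changing any dimensions, which is the property that variable-length assignment relies on. In `aggregate_messages`, the radius check acts as a hard mask applied before the softmax, so out-of-range neighbors receive exactly zero weight.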

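Keywords: UAV swarm; heterogeneous graph attention network; multi-agent reinforcement learning; hierarchical decision-making; distributed cooperation; reconnaissance-strike cooperation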

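Cite this article

ZHANG Dong, FU Jin-Bo, WANG Meng-Yang, SHEN Tong. Autonomous decision-making method for integrated reconnaissance and strike operations under local observation and limited communication (Special Issue on Frontier Technologies for Intelligent High-Speed Flight Vehicles)[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2026.33373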


