A Manned/Unmanned Aerial Vehicle Cooperative Interpretable Method for Intelligent Air Combat

  • XIONG Wei ,
  • ZHANG Dong ,
  • YANG Shu-Heng ,
  • REN Zhi ,
  • LIU Wen-Yi

  • 1. Northwestern Polytechnical University
    2. Shaanxi Key Laboratory of Aerospace Vehicle Design, Northwestern Polytechnical University
    3. Northwest Institute of Mechanical and Electrical Engineering

Received date: 2025-07-10

  Revised date: 2025-11-22

  Online published: 2025-11-25

Supported by

National Natural Science Foundation of China



Cite this article

XIONG Wei, ZHANG Dong, YANG Shu-Heng, REN Zhi, LIU Wen-Yi. A Manned/Unmanned Aerial Vehicle Cooperative Interpretable Method for Intelligent Air Combat[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2025.32547

Abstract

Manned-unmanned aerial vehicle (UAV) cooperation represents a critical operational paradigm for future air combat, where deep reinforcement learning serves as a key enabling technology. However, the "black-box nature" of deep reinforcement learning renders the learned strategies difficult to interpret and trust, making explainable deep reinforcement learning essential for achieving intelligent cooperative air combat. This paper proposes a Bayesian-Shapley framework-based explainable deep reinforcement learning method, which enables interpretable modeling and verification of the decision-making process, thereby assisting pilots in understanding UAV decision logic. The proposed approach first constructs a decision intent analysis framework for cooperative missions using dynamic Bayesian networks, capable of identifying critical decision nodes in trajectory segments. Subsequently, it employs the Shapley value-based contribution assessment algorithm to achieve state-level quantitative analysis of decision rationale at key nodes. Finally, by reconstructing the state input space of the deep reinforcement learning model, the method significantly enhances model interpretability and trustworthiness while maintaining original policy performance, with the effectiveness of the explanatory results validated through state space ablation simulations.
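The Shapley contribution step described above can be illustrated with a minimal sketch: estimate each state feature's Shapley value for a policy's scalar output by Monte Carlo permutation sampling, replacing "absent" features with a baseline state. All names here (the `policy` callable, the baseline, the toy linear policy) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def shapley_contributions(policy, state, baseline, n_samples=200, seed=None):
    """Monte Carlo estimate of each state feature's Shapley contribution
    to a scalar policy output. Features not yet in the coalition are
    replaced by the corresponding baseline value (e.g. a mean state)."""
    rng = np.random.default_rng(seed)
    d = len(state)
    phi = np.zeros(d)
    for _ in range(n_samples):
        perm = rng.permutation(d)           # random feature ordering
        masked = baseline.astype(float).copy()
        prev = policy(masked)
        for i in perm:
            masked[i] = state[i]            # add feature i to the coalition
            cur = policy(masked)
            phi[i] += cur - prev            # marginal contribution of i
            prev = cur
    return phi / n_samples

# Toy linear "policy": for a linear model the Shapley values are exactly
# w * (state - baseline), which the estimator recovers.
w = np.array([2.0, -1.0, 0.5])
policy = lambda s: float(w @ s)
state = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)
print(shapley_contributions(policy, state, baseline, n_samples=50))
# → [ 2.  -1.   0.5]
```

For the nonlinear policies actually learned in cooperative air combat, the estimate converges with the number of sampled permutations; the resulting per-feature scores are what a state-level ablation (zeroing or baseline-substituting low-contribution features) would then verify.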