A Manned/Unmanned Aerial Vehicle Cooperative Interpretable Method for Intelligent Air Combat

XIONG Wei1, ZHANG Dong1, YANG Shuheng1, REN Zhi1,2, LIU Wenyi3

  1. Northwestern Polytechnical University
    2. Shaanxi Key Laboratory of Aerospace Vehicle Design, Northwestern Polytechnical University
    3. Northwest Institute of Mechanical & Electrical Engineering
  • Received: 2025-07-10 Revised: 2025-11-22 Online: 2025-11-25 Published: 2025-11-25
  • Corresponding author: ZHANG Dong
  • Supported by:
    National Natural Science Foundation of China



Abstract: Manned/unmanned aerial vehicle (UAV) cooperation represents a critical operational paradigm for future air combat, in which deep reinforcement learning serves as a key enabling technology. However, the "black-box" nature of deep reinforcement learning renders the learned policies difficult to interpret and trust, making explainable deep reinforcement learning essential for achieving intelligent cooperative air combat. This paper proposes an explainable deep reinforcement learning method based on a Bayesian-Shapley framework, which enables interpretable modeling and verification of the decision-making process and thereby helps pilots understand the rationale behind UAV decisions. The approach first constructs a decision-intent analysis framework for cooperative missions using dynamic Bayesian networks, capable of locating critical decision nodes in trajectory segments. It then employs a Shapley-value contribution assessment algorithm to provide a state-level quantitative analysis of the decision rationale at these key nodes. Finally, by reconstructing the state input space of the deep reinforcement learning model, the method significantly improves interpretability and trustworthiness while preserving the original policy performance, and the validity of the explanations is verified through state-space ablation simulations.
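The Shapley contribution step described in the abstract can be illustrated with a minimal sketch: exact Shapley values of each state feature with respect to a policy's action score, computed by enumerating feature coalitions and masking absent features with a baseline. The toy linear score, the example air-combat feature names, and the zero baseline are illustrative assumptions, not the paper's actual policy network.

```python
from itertools import combinations
from math import factorial

def shapley_values(score, state, baseline):
    """Exact Shapley contribution of each state feature to score(state).

    Features outside a coalition are replaced by their baseline values;
    each feature's value is its coalition-weighted marginal contribution.
    """
    n = len(state)
    phi = [0.0] * n
    players = list(range(n))
    for i in players:
        others = [j for j in players if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1 among the other features
            for S in combinations(others, k):
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                with_i = [state[j] if (j in S or j == i) else baseline[j] for j in players]
                without_i = [state[j] if j in S else baseline[j] for j in players]
                phi[i] += weight * (score(with_i) - score(without_i))
    return phi

# Toy "policy score" over normalized state features, e.g.
# [relative distance, closure rate, aspect angle] (hypothetical labels).
score = lambda s: 2.0 * s[0] - 1.0 * s[1] + 0.5 * s[2]
state = [0.8, 0.3, 0.6]
baseline = [0.0, 0.0, 0.0]

# For a linear score with a zero baseline, feature i's Shapley value
# reduces to w_i * s_i, and the values sum to score(state) - score(baseline).
print(shapley_values(score, state, baseline))
```

For the real method, `score` would be the critic's value (or an action logit) of the trained deep reinforcement learning policy, and exact enumeration would be replaced by a sampling approximation, since the exact computation is exponential in the number of state features.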

Key words: Human-machine collaboration, Deep reinforcement learning, Interpretability, Intelligent air combat, Intention identification

CLC number: