Manned-unmanned aircraft teaming is an important operational form of future air combat, and deep reinforcement learning is a key enabling technology for manned/unmanned cooperative air combat. However, the "black-box" nature of deep reinforcement learning makes the learned policies difficult to understand and trust, so explainable deep reinforcement learning is essential for intelligent manned/unmanned cooperative air combat. This paper proposes an explainable deep reinforcement learning method based on a Bayesian-Shapley framework, which enables interpretable modeling and verification of the decision-making process and helps pilots understand the basis of UAV decisions. The method first builds a decision-intent analysis framework for cooperative missions on dynamic Bayesian networks, which locates the critical decision nodes within trajectory segments; it then applies a Shapley contribution assessment algorithm to perform state-level quantitative analysis of the decision rationale at those critical nodes; finally, by reconstructing the state input space of the deep reinforcement learning model, the method significantly improves the model's interpretability and trustworthiness while preserving the original policy performance, and the validity of the explanations is verified through state-space ablation simulations.
Manned-unmanned aerial vehicle (UAV) cooperation represents a critical operational paradigm for future air combat, where deep reinforcement learning serves as a key enabling technology. However, the "black-box nature" of deep reinforcement learning renders the learned strategies difficult to interpret and trust, making explainable deep reinforcement learning essential for achieving intelligent cooperative air combat. This paper proposes a Bayesian-Shapley framework-based explainable deep reinforcement learning method, which enables interpretable modeling and verification of the decision-making process, thereby assisting pilots in understanding UAV decision logic. The proposed approach first constructs a decision intent analysis framework for cooperative missions using dynamic Bayesian networks, capable of identifying critical decision nodes in trajectory segments. Subsequently, it employs a Shapley value-based contribution assessment algorithm to achieve state-level quantitative analysis of decision rationale at key nodes. Finally, by reconstructing the state input space of the deep reinforcement learning model, the method significantly enhances model interpretability and trustworthiness while maintaining original policy performance, with the effectiveness of the explanatory results validated through state-space ablation simulations.
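As a rough illustration of the state-level Shapley contribution step summarized above, the sketch below estimates per-feature contributions to a policy's output at a single decision node via Monte Carlo permutation sampling. The `value_fn` interface, the baseline state used to represent "absent" features, and the toy usage at the end are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def shapley_contributions(value_fn, state, baseline, n_samples=2000, seed=None):
    """Monte Carlo estimate of per-feature Shapley contributions at one decision node.

    value_fn : callable mapping a state vector to a scalar, e.g. the critic value or
               the logit of the chosen action (hypothetical interface).
    state    : state vector at the critical decision node.
    baseline : reference vector substituted for features treated as "absent".
    """
    rng = np.random.default_rng(seed)
    state = np.asarray(state, dtype=float)
    d = state.shape[0]
    phi = np.zeros(d)
    for _ in range(n_samples):
        perm = rng.permutation(d)            # random ordering of features
        x = np.array(baseline, dtype=float)  # start from the baseline state
        prev = value_fn(x)
        for i in perm:
            x[i] = state[i]                  # reveal feature i
            cur = value_fn(x)
            phi[i] += cur - prev             # marginal contribution of feature i
            prev = cur
    return phi / n_samples

# Toy usage with a linear stand-in for the policy's value head (illustrative only):
value_fn = lambda s: 2.0 * s[0] - 0.5 * s[1] + s[2]
phi = shapley_contributions(value_fn, state=[1.0, 2.0, 0.5], baseline=[0.0, 0.0, 0.0], seed=0)
print(phi)  # per-feature contributions summing (approximately) to value(state) - value(baseline)
```

In the same spirit, the state-space ablation check described in the abstract can be read as masking the low-contribution features identified by such estimates and verifying that the policy's performance is largely unchanged, whereas masking high-contribution features degrades it.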