With the development of aviation artificial intelligence (AI) technology, the intelligent application software residing on integrated modular avionics (IMA) platforms faces two problems: scarce platform resources and decisions that are difficult to trust. First, taking the architectural elements of the avionics system into account, an optimisation objective for IMA resource allocation decision-making under multiple constraints is formulated, and a deep reinforcement learning algorithm is used to obtain a near-optimal allocation of IMA platform resources over the sequential decision-making process. Next, a decision attribution explanation framework for IMA resource allocation is established: the expert policy of the reinforcement learning agent is extracted by aggregating and resampling the training dataset, and the Shapley additive explanations (SHAP) method is then applied to produce combined global and local explanations of IMA resource allocation decisions. Simulation experiments and result analysis show that, compared with the greedy algorithm and other policy-based reinforcement learning algorithms, the proposed method converges quickly, learns effectively, and solves the IMA resource allocation problem with markedly higher efficiency. In addition, it generates quantitative, visual feature attribution explanations that reveal how strongly each reinforcement learning input feature influences a decision and what the decision intent is, providing methodological guidance for airworthiness compliance verification of AI-based avionics systems with respect to explainability.
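The sequential allocation step described above can be illustrated with a minimal sketch: a toy MDP in which applications are placed one by one onto IMA processing modules with capacity limits, trained with REINFORCE (a basic policy-gradient method). The paper's actual state/action encoding, constraint set, and algorithm are not reproduced here; all module counts, capacities, demands, and reward values below are assumptions chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

N_APPS = 4                                # applications, placed one per decision step
N_MODULES = 3                             # IMA processing modules
CAPACITY = np.array([2.0, 2.0, 2.0])      # assumed CPU capacity of each module
DEMAND = np.array([1.0, 1.5, 0.5, 1.0])   # assumed CPU demand of each application

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def episode(theta):
    """Place each app on a module sequentially; +1 per feasible placement, -1 per violation."""
    load = np.zeros(N_MODULES)
    states, actions, ret = [], [], 0.0
    for app in range(N_APPS):
        # State feature: remaining capacity of each module plus the current demand.
        s = np.concatenate([CAPACITY - load, [DEMAND[app]]])
        probs = softmax(theta @ s)                 # linear softmax policy
        a = rng.choice(N_MODULES, p=probs)
        states.append(s); actions.append(a)
        if load[a] + DEMAND[app] <= CAPACITY[a]:
            load[a] += DEMAND[app]
            ret += 1.0
        else:
            ret -= 1.0                             # capacity violation penalty
    return states, actions, ret

def train(iters=400, lr=0.1):
    theta = np.zeros((N_MODULES, N_MODULES + 1))
    for _ in range(iters):
        states, actions, R = episode(theta)
        for s, a in zip(states, actions):
            # REINFORCE: accumulate R * grad log pi(a|s) for the linear softmax policy.
            probs = softmax(theta @ s)
            grad = -np.outer(probs, s)
            grad[a] += s
            theta += lr * R * grad
    return theta

theta = train()
_, _, R = episode(theta)   # sampled return of the trained policy, in [-N_APPS, N_APPS]
```

The reward shaping (a flat bonus per feasible placement) is deliberately simplistic; the paper's multi-constraint objective would replace it with its own scoring of the allocation.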
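The additive-explanation idea behind SHAP can likewise be sketched on a small scale: for a surrogate scoring function f, the Shapley value of each input feature is its weighted average marginal contribution over all feature subsets, and the attributions sum exactly to f(x) minus f at a reference input (the efficiency property). The toy model f and the feature values here are assumptions, not the paper's extracted expert policy.

```python
import itertools
import math
import numpy as np

def f(x):
    """Toy surrogate allocation score: two linear terms plus one interaction term."""
    return 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[0] * x[2]

def shapley_values(f, x, baseline):
    """Exact Shapley values; an 'absent' feature is held at its baseline value."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in itertools.combinations(others, k):
                # Weight of a coalition of size k in the Shapley formula.
                w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                with_i, without_i = baseline.copy(), baseline.copy()
                for j in S:
                    with_i[j] = x[j]
                    without_i[j] = x[j]
                with_i[i] = x[i]
                phi[i] += w * (f(with_i) - f(without_i))
    return phi

x = np.array([1.0, 0.5, 2.0])   # hypothetical features, e.g. CPU load, memory load, bandwidth
baseline = np.zeros(3)          # all-zero reference input
phi = shapley_values(f, x, baseline)
```

Exact enumeration is exponential in the number of features, which is why practical SHAP implementations approximate these values; for this three-feature example the linear terms are attributed wholly to their own features and the interaction term is split evenly between features 0 and 2.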