Acta Aeronautica et Astronautica Sinica, 2021, Vol. 42, Issue 8: 525792. doi: 10.7527/S1000-6893.2021.25792

Generating reconfiguration blueprints for IMA systems based on improved Q-learning

LUO Qing1,2, ZHANG Tao3, SHAN Peng4, ZHANG Wentao3, LIU Zihao3

  1. AVIC Shenyang Aircraft Design and Research Institute, Shenyang 110035, China;
  2. College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China;
  3. School of Software, Northwestern Polytechnical University, Xi'an 710072, China;
  4. AVIC Xi'an Institute of Aeronautical Computing Technology, Xi'an 710065, China
  • Received: 2021-04-15  Revised: 2021-05-08  Published: 2021-05-31
  • Corresponding author: ZHANG Tao  E-mail: tao_zhang@nwpu.edu.cn
  • Supported by:
    Aeronautical Science Foundation of China (2015ZD53055, 20185853038, 201907053004)

Abstract: A reconfiguration blueprint defines how a system's hardware and software resources are reconfigured when a fault occurs, and is key to reconfiguration-based fault tolerance in Integrated Modular Avionics (IMA) systems. This paper proposes a method for generating reconfiguration blueprints based on improved Q-learning. The method jointly considers multiple optimization objectives, including load balance, reconfiguration impact, reconfiguration time, and reconfiguration degradation, and applies a simulated annealing framework to improve the exploration strategy, which improves the convergence performance of traditional Q-learning. Experimental results show that, compared with the simulated annealing algorithm, the differential evolution algorithm, and the traditional Q-learning algorithm, the proposed improved Q-learning algorithm is more efficient and generates reconfiguration blueprints of higher quality.

Key words: reinforcement learning, Q-learning, simulated annealing algorithm, integrated modular avionics system, multi-objective optimization, reconfiguration
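
The abstract does not state the update rule, the reward design, or how simulated annealing enters the exploration step, so the following is only an illustrative sketch and not the authors' algorithm. It assumes a Metropolis-style acceptance test between a random action and the greedy action, with a geometrically cooled temperature. The environment is a toy stand-in: the application loads (`APP_LOADS`), the number of healthy modules (`N_MODULES`), the single load-imbalance reward, and all hyperparameters are hypothetical.

```python
import math
import random
from collections import defaultdict

APP_LOADS = [3, 2, 2, 1]  # hypothetical load of each application to be re-hosted
N_MODULES = 2             # hypothetical number of healthy modules


def step(state, action):
    """Place the next application on module `action`."""
    idx, loads = state
    loads = list(loads)
    loads[action] += APP_LOADS[idx]
    nxt = (idx + 1, tuple(loads))
    done = nxt[0] == len(APP_LOADS)
    # Reward only at the end: negative load imbalance, a stand-in
    # for one of the objectives named in the abstract (load balance).
    reward = -float(max(loads) - min(loads)) if done else 0.0
    return nxt, reward, done


def sa_select(q, state, temperature, rng):
    """Metropolis-style choice between a random and the greedy action."""
    greedy = max(range(N_MODULES), key=lambda a: q[(state, a)])
    candidate = rng.randrange(N_MODULES)
    delta = q[(state, candidate)] - q[(state, greedy)]  # never positive
    if rng.random() < math.exp(delta / temperature):
        return candidate
    return greedy


def train(episodes=2000, alpha=0.1, gamma=1.0, t0=1.0, cooling=0.995, seed=0):
    """Tabular Q-learning with annealed exploration (assumed schedule)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    temperature = t0
    for _ in range(episodes):
        state, done = (0, (0,) * N_MODULES), False
        while not done:
            action = sa_select(q, state, temperature, rng)
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in range(N_MODULES))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
        temperature = max(temperature * cooling, 1e-3)  # cool once per episode
    return q


def greedy_blueprint(q):
    """Read out an assignment by following the greedy policy."""
    state, done, plan = (0, (0,) * N_MODULES), False, []
    while not done:
        action = max(range(N_MODULES), key=lambda a: q[(state, a)])
        plan.append(action)
        state, _, done = step(state, action)
    return plan, state[1]
```

Calling `greedy_blueprint(train())` returns a module index for each application together with the resulting per-module loads. A multi-objective version would presumably replace the single imbalance term with a combination of the four objectives listed in the abstract; how the paper combines them is not stated here.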
