Acta Aeronautica et Astronautica Sinica > 2025, Vol. 46 Issue (18): 331945-331945   doi: 10.7527/S1000-6893.2025.31945

Hierarchical dynamic scheduling for multi-wave carrier-based aircraft ammunition support missions

Yizhe LUO1,2,3, Hui ZHANG1, Xinde YU1, Zhao JIN1,2,3, Shuo FENG1,2,3, Yucheng SHI1,2,3, Mingliang XU1,2,3

  1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
  2. Engineering Research Center of Intelligent Swarm Systems, Ministry of Education, Zhengzhou 450001, China
  3. National Supercomputing Center in Zhengzhou, Zhengzhou 450001, China
  • Received: 2025-03-06 Revised: 2025-03-19 Accepted: 2025-05-13 Published: 2025-09-25 Online: 2025-06-06
  • Contact: Xinde YU E-mail: xdzzu2022@163.com
  • Supported by:
    National Natural Science Foundation of China (62406292, 62302459, 62406293, 62325602, 62036010)

Abstract:

In scheduling ammunition support operations for carrier-based aircraft on an aircraft carrier, the various types of transfer equipment are tightly coupled with the support processes, which makes the state space of the scheduling problem strongly non-convex. When a large quantity of ammunition must be supported across multiple waves, the search space grows further, which lowers the efficiency of the ammunition support process and makes it difficult to meet the dynamic, real-time requirements of the task. Drawing on the divide-and-conquer idea, this paper proposes a dynamic scheduling method for carrier-based aircraft ammunition support operations based on hierarchical reinforcement learning. First, the scheduling decision process is decoupled and executed separately at a top level and a bottom level, which reduces the impact of the problem's non-convexity and scale. Second, a decision network for ammunition transfer equipment is trained at the bottom level and, once it has converged, is embedded in the top-level environment to provide real-time bottom-level feedback. Meanwhile, a decision network for the ammunition support sequence is trained at the top level, and a resource reservation mechanism is designed that recursively calculates ammunition transfer times to determine the time windows in which each piece of transfer equipment is available, thereby avoiding conflicts over equipment occupancy. Finally, the proposed algorithm is validated in typical mission scenarios. The results show that, compared with traditional optimization algorithms, the proposed algorithm substantially improves real-time decision-making at the cost of a small increase in transfer time. It balances ammunition support time against the time needed to produce a support plan, making it suitable for strongly real-time, highly dynamic support tasks.

Key words: hierarchical reinforcement learning, carrier-based aircraft, scheduling optimization, resource constraints, ammunition support operations
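
The abstract describes a resource reservation mechanism that recursively calculates transfer times to find when each piece of transfer equipment is available. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes each ammunition job follows a fixed route of (device type, duration) stages and that each stage takes the earliest-free device of the required type. The names `Device`, `reserve_route`, and `schedule`, the device types, and all durations are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A transfer device with its reserved (start, end) intervals."""
    name: str
    free_at: float = 0.0
    reservations: list = field(default_factory=list)


def reserve_route(route, pools, release_time=0.0):
    """Reserve one device per stage for a single ammunition job.

    route: list of (device_type, duration) pairs in processing order.
    pools: dict mapping device_type -> list of Device.
    Returns the completion time of the job.
    """
    t = release_time
    for device_type, duration in route:
        # Assumption: take the device of this type that frees up earliest.
        device = min(pools[device_type], key=lambda d: d.free_at)
        # A stage cannot start before the previous stage ends
        # or before the device finishes its previous reservation.
        start = max(t, device.free_at)
        end = start + duration
        device.reservations.append((start, end))
        device.free_at = end
        t = end  # recurrence: the next stage starts from this finish time
    return t


def schedule(order, routes, pools):
    """Apply reservations job by job in the given support order."""
    return {job: reserve_route(routes[job], pools) for job in order}


# Hypothetical scenario: one elevator, two carts, three jobs.
pools = {
    "elevator": [Device("E1")],
    "cart": [Device("C1"), Device("C2")],
}
routes = {
    "A": [("elevator", 3.0), ("cart", 5.0)],
    "B": [("elevator", 3.0), ("cart", 4.0)],
    "C": [("elevator", 2.0), ("cart", 1.0)],
}
finish = schedule(["A", "B", "C"], routes, pools)
print(finish)  # {'A': 8.0, 'B': 10.0, 'C': 9.0}
```

Because every reservation starts no earlier than the device's previous release time, intervals on the same device never overlap. In the hierarchical setting of the abstract, the support order passed to `schedule` would come from the top-level decision network and the device choice from the bottom-level one; the greedy earliest-free rule here only stands in for that learned choice.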

CLC number: