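A Scheduling Algorithm for Carrier-Based Aircraft Support Operations Based on the Deck Environment Model

LUO Yi-Zhe; WANG Jia-Bao; YU Xin-De; CHEN Xu-Dong; JIN Zhao; FENG Shuo; SHI Yu-Cheng; XU Ming-Liang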

doi:10.7527/S1000-6893.2026.33180

Acta Aeronautica et Astronautica Sinica (航空学报)


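1. School of Computer and Artificial Intelligence, Zhengzhou University
2. School of Computer and Artificial Intelligence, Zhengzhou University
3. Zhengzhou University

Received: 2025-12-03

Revised: 2026-05-11

Published online: 2026-05-14

Funding

National Natural Science Foundation of China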


Cite this article

LUO Yi-Zhe, WANG Jia-Bao, YU Xin-De, CHEN Xu-Dong, JIN Zhao, FENG Shuo, SHI Yu-Cheng, XU Ming-Liang. A Scheduling Algorithm for Carrier-Based Aircraft Support Operations Based on the Deck Environment Model[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2026.33180

Abstract

In scheduling support operations for carrier-based aircraft, model-free reinforcement learning (MFRL) is limited in dynamic deck scenarios by the accuracy with which the physical environment can be modeled. Model-based reinforcement learning (MBRL), in turn, must co-optimize an environment model and a decision model that depend on each other during iterative training, which leads to high computational complexity and difficult convergence. To address both problems, this paper proposes a hybrid reinforcement learning framework (MB-MF) that combines model-based and model-free characteristics. First, a deck environment model based on a deep neural network is trained on historical scheduling data so that it predicts state transitions accurately within a minimal tolerance. Next, the converged environment model replaces the real environment and is embedded in the interaction loop, where a scheduling agent is trained with the Deep Q-Network (DQN) algorithm; this decouples environment model learning from policy optimization. Experiments show that, compared with MFRL trained on the physical environment, the proposed method has a performance gap of only 4% while requiring no precise modeling, and that it shortens aircraft sortie time by 34% relative to the MBRL baseline. In resource-constrained scenarios, its decision speed is nearly 300 times that of heuristic methods, while scheduling quality decreases by only 17%.

Key words: carrier-based aircraft; deck operations; complex job scheduling; reinforcement learning; deep learning
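
The two-stage procedure summarized in the abstract (fit an environment model on logged data, freeze it, then train the agent only against that model) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy problem (one crew, two aircraft with remaining service times 3 and 1), the lookup-table model standing in for the deep neural network, tabular Q-learning standing in for DQN, and all function names.

```python
import random
from collections import defaultdict

START = (3, 1)    # hypothetical remaining service time of two aircraft
ACTIONS = (0, 1)  # index of the aircraft the single crew services next


def true_step(state, action):
    """Toy 'physical' deck: servicing an aircraft lowers its remaining time by 1."""
    remaining = list(state)
    if remaining[action] > 0:
        remaining[action] -= 1
    next_state = tuple(remaining)
    reward = -sum(1 for r in next_state if r > 0)  # cost 1 per aircraft still waiting
    done = all(r == 0 for r in next_state)
    return next_state, reward, done


def collect_logs(episodes, rng):
    """Stand-in for historical scheduling data: transitions from a random dispatcher."""
    logs = []
    for _ in range(episodes):
        state, done = START, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = true_step(state, action)
            logs.append((state, action, next_state, reward, done))
            state = next_state
    return logs


def fit_model(logs):
    """Stage 1: fit the environment model (a lookup table here, not a neural network)."""
    return {(s, a): (s2, r, d) for s, a, s2, r, d in logs}


def train_agent(model, episodes, rng, epsilon=0.3, max_steps=50):
    """Stage 2: tabular Q-learning that interacts only with the frozen learned model."""
    q = defaultdict(float)
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = model[(state, action)]
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Learning rate 1 suffices because the toy dynamics are deterministic.
            q[(state, action)] = reward + best_next
            state = next_state
            if done:
                break
    return q


def evaluate(q, max_steps=50):
    """Run the greedy policy on the true environment; return (total reward, actions)."""
    state, total, actions = START, 0, []
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state, reward, done = true_step(state, action)
        total += reward
        actions.append(action)
        if done:
            break
    return total, actions


rng = random.Random(0)
model = fit_model(collect_logs(300, rng))  # stage 1: model learning
q_table = train_agent(model, 500, rng)     # stage 2: policy learning on the model
total_reward, schedule = evaluate(q_table)
print(total_reward, schedule)
```

In this toy setting the agent never touches `true_step` during training; the true environment is used only to generate the logs and to evaluate the final policy. Servicing the shorter job first gives a total waiting cost of 3, against 5 when the longer job goes first, so the sketch also shows how a policy learned on the model can be checked against the real dynamics.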
