A Scheduling Algorithm for Carrier-Based Aircraft Support Operations Based on the Deck Environment Model

doi:10.7527/S1000-6893.2026.33180


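罗祎喆¹, 王佳宝², 余新得³, 陈旭东², 金钊³, 冯硕³, 石育澄², 徐明亮³

1. School of Computer and Artificial Intelligence, Zhengzhou University
2. School of Computer and Artificial Intelligence, Zhengzhou University
3. Zhengzhou University

Received: 2025-12-03  Revised: 2026-05-11  Online: 2026-05-14  Published: 2026-05-14
Corresponding author: 金钊
Supported by: National Natural Science Foundation of China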


Abstract

In scheduling support operations for carrier-based aircraft, model-free reinforcement learning (MFRL) is limited in dynamic deck scenarios by the accuracy with which the physical environment can be modeled. Model-based reinforcement learning (MBRL), in turn, suffers from high computational complexity and poor convergence, because the environment model and the decision model depend on each other and must be co-optimized during iterative training. To address both problems, this paper proposes a hybrid reinforcement learning framework (MB-MF) that combines model-based and model-free characteristics. First, a deck environment model based on a deep neural network is trained on historical scheduling data so that it predicts state transitions accurately within a minimal tolerance. Second, the converged environment model replaces the real environment inside the interaction loop, and a scheduling agent is trained against it with the Deep Q-Network (DQN) algorithm, which decouples environment-model learning from policy optimization. Experiments show that, without requiring precise modeling, the method comes within 4% of the performance of MFRL trained on the physical environment, and that it shortens aircraft sortie time by 34% relative to the MBRL baseline. In resource-constrained scenarios, it makes decisions nearly 300 times faster than heuristic methods, while scheduling quality drops by only 17%.

Keywords: carrier-based aircraft, deck operations, complex job scheduling, reinforcement learning, deep learning
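
The two-stage decoupling described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the toy deck (two aircraft sharing one crew), the names `true_step`, `collect_logs`, `fit_model`, `train_agent`, and `greedy_makespan`, a lookup table standing in for the deep neural network environment model, and tabular Q-learning standing in for DQN.

```python
import random

START = (2, 3)  # hypothetical toy state: remaining service steps for two aircraft


def true_step(state, action):
    """Ground-truth toy deck: the single crew serves aircraft `action` for one step."""
    remaining = list(state)
    if remaining[action] > 0:  # serving an already finished aircraft wastes the step
        remaining[action] -= 1
    nxt = tuple(remaining)
    return nxt, -1.0, sum(nxt) == 0  # reward -1 per time step; done when all are served


def collect_logs(episodes, rng, max_steps=30):
    """Stand-in for historical scheduling data: transitions from a random dispatcher."""
    logs = []
    for _ in range(episodes):
        s = START
        for _ in range(max_steps):
            a = rng.randrange(2)
            s2, r, done = true_step(s, a)
            logs.append((s, a, s2, r, done))
            if done:
                break
            s = s2
    return logs


def fit_model(logs):
    """Stage 1: learn the environment model (a lookup table here, not a neural network)."""
    return {(s, a): (s2, r, done) for s, a, s2, r, done in logs}


def train_agent(model, rng, episodes=2000, alpha=0.5, eps=0.2, max_steps=50):
    """Stage 2: Q-learning that only queries the frozen learned model, never true_step."""
    q = {}
    for _ in range(episodes):
        s = START
        for _ in range(max_steps):
            if rng.random() < eps:  # epsilon-greedy exploration
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: q.get((s, x), 0.0))
            # Assumption: an unseen (state, action) pair is treated as "no progress".
            s2, r, done = model.get((s, a), (s, -1.0, False))
            target = r if done else r + max(q.get((s2, x), 0.0) for x in (0, 1))
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (target - old)
            if done:
                break
            s = s2
    return q


def greedy_makespan(q, max_steps=50):
    """Evaluate the learned policy on the true environment; return the steps used."""
    s, steps = START, 0
    while sum(s) > 0 and steps < max_steps:
        a = max((0, 1), key=lambda x: q.get((s, x), 0.0))
        s, _, _ = true_step(s, a)
        steps += 1
    return steps


rng = random.Random(0)
model = fit_model(collect_logs(300, rng))  # environment model is learned first...
q_table = train_agent(model, rng)          # ...then frozen while the policy is trained
print(greedy_makespan(q_table))
```

The shortest possible makespan for this toy is 5 steps (2 + 3), so a converged policy should reach that value when evaluated on the true environment. The point of the sketch is the ordering: the model is fitted once and then held fixed, so policy training never has to co-adapt with a changing environment model.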

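罗祎喆, 王佳宝, 余新得, 陈旭东, 金钊, 冯硕, 石育澄, 徐明亮. A Scheduling Algorithm for Carrier-Based Aircraft Support Operations Based on the Deck Environment Model[J]. 航空学报 (Acta Aeronautica et Astronautica Sinica), doi: 10.7527/S1000-6893.2026.33180.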



