To address the insufficient modeling of subtask coupling and the limited dynamic adaptability in existing research on carrier-based aircraft support operation scheduling, this paper studies the support operation scheduling problem with multi-stage dependencies. First, support station allocation and aircraft servicing sequence decisions are modeled as a multi-agent Markov decision process, which gives a mathematical characterization of the sequential coupling between the scheduling subtasks. Second, a multi-agent collaborative decision-making framework based on independent deep Q-networks is proposed; it adopts a distributed training-execution mechanism and comprises a support station allocation module, an aircraft servicing sequence decision module, and a multi-agent collaborative scheduling module. Building on this framework, a collaborative scheduling algorithm based on a multi-stage sequential decision-making mechanism is developed to solve the model. Finally, simulation results show that, after convergence, the proposed algorithm improves the average reward by 27.08% and 14.19% over the Dueling DQN and N-step DQN methods, respectively, and improves the reward standard deviation by 56.44% and 45.43%, respectively, which verifies the effectiveness of the multi-stage collaborative decision-making mechanism for complex scheduling problems.
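The sequential coupling between the two subtasks can be illustrated with a toy sketch. This is not the paper's implementation: tabular Q-learning stands in for the deep Q-networks, and the class `IndependentQAgent`, the function `toy_reward`, the two-station/two-order action spaces, and all hyperparameters are hypothetical choices made only for illustration. A station-allocation agent acts first, and a sequencing agent then observes the chosen station before acting; each agent updates its own value table independently from the shared reward.

```python
import random
from collections import defaultdict


class IndependentQAgent:
    """Tabular stand-in for one independent DQN learner (hypothetical)."""

    def __init__(self, n_actions, lr=0.1, epsilon=0.2, seed=0):
        self.n_actions = n_actions
        self.lr = lr
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Q-table: state -> list of action values, initialised to zero.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, state, greedy=False):
        # Epsilon-greedy action selection; greedy=True disables exploration.
        if not greedy and self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, reward):
        # Each toy episode is a single joint decision, so the target is
        # just the immediate reward (no bootstrapped next-state term).
        self.q[state][action] += self.lr * (reward - self.q[state][action])


def toy_reward(station, order):
    # Assumed coupling: the best servicing order depends on the station.
    if station == 1 and order == 0:
        return 1.0
    if station == 0 and order == 1:
        return 0.5
    return 0.0


def train(episodes=2000):
    station_agent = IndependentQAgent(n_actions=2, seed=1)
    order_agent = IndependentQAgent(n_actions=2, seed=2)
    for _ in range(episodes):
        # Stage 1: allocate a support station.
        station = station_agent.act("start")
        # Stage 2: choose a servicing order, conditioned on the station.
        order = order_agent.act(station)
        reward = toy_reward(station, order)
        # Independent learning: each agent updates only its own table.
        station_agent.update("start", station, reward)
        order_agent.update(station, order, reward)
    return station_agent, order_agent


station_agent, order_agent = train()
```

After training, `station_agent.act("start", greedy=True)` gives the learned station choice and `order_agent.act(station, greedy=True)` gives the order selected for that station, which shows how the second decision depends on the first.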