Acta Aeronautica et Astronautica Sinica, 2023, Vol. 44, Issue 13: 327895. doi: 10.7527/S1000-6893.2023.27895

Distributed multi-agent coalition task allocation strategy for single pilot operation mode based on DQN

Lei DONG (1,2,3), Hongbing CHEN (2,3), Xi CHEN (1,2,3), Changxiao ZHAO (1,2,3)

  1. Key Laboratory of Civil Aircraft Airworthiness Technology, Civil Aviation University of China, Tianjin 300300, China
  2. Civil Aircraft Airworthiness and Repair Key Laboratory of Tianjin, Civil Aviation University of China, Tianjin 300300, China
  3. College of Safety Science and Engineering, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2022-08-03  Revised: 2022-11-30  Accepted: 2023-02-23  Published: 2023-07-15  Online: 2023-03-10
  • Contact: Changxiao ZHAO  E-mail: zhaochangxiao@yeah.net
  • Supported by:
    National Key Research and Development Program (2021YFB1600600); Tianjin Education Commission Scientific Research Project (2022KJ058); Fundamental Research Funds for the Central Universities (3122022044); Graduate Research Innovation Funding Project of Civil Aviation University of China (2021YJS011)

Abstract:

Distributed task decision-making is key to improving the autonomy of multi-agent systems in the distributed cooperative flight organization architecture of the Single Pilot Operation (SPO) mode. Against the background of multiple agents cooperating to execute complex tasks, we first build a coalition task allocation decision model for distributed multi-agent systems in SPO mode that accounts for multiple constraints, including the payload resource requirements of tasks, the limits on agents' resource space, and execution time windows. Second, we design a function approximator for the Q-value estimation network and propose a coalition task allocation method based on the Deep Q-Network (DQN), which selects effective agents and generates the best execution path for the optimal coalition task allocation result, so that each agent in the coalition can optimize its scheduling in a more adaptive manner. Finally, numerical simulation verifies the effectiveness and speed of the DQN method in solving the multi-agent coalition task allocation problem for SPO mode under complex constraints.

Key words: single pilot operation, multi-agent system, task allocation, coalition formation, deep reinforcement learning, neural network
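
The abstract names two ingredients: a coalition feasibility model with resource and time-window constraints, and a DQN that picks agents by their estimated Q-values. The sketch below illustrates both in miniature and is not the authors' formulation. The `Agent` and `Task` classes, their fields, the rule that pooled coalition resources must cover the task demand, and the rule that an agent must be available before the window closes are all assumptions made for illustration. A plain list of numbers stands in for the output of the Q-value network, and `td_target` is the standard DQN bootstrap target rather than anything specific to this paper.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Agent:
    name: str
    resources: Tuple[float, ...]  # amount of each resource type the agent can contribute (assumed)
    available_from: float         # earliest time the agent can start working (assumed)


@dataclass(frozen=True)
class Task:
    demand: Tuple[float, ...]     # required amount of each resource type (assumed)
    window: Tuple[float, float]   # (earliest start, latest start) of the execution window


def in_window(agent: Agent, task: Task) -> bool:
    """Assumed rule: the agent must be available before the window closes."""
    return agent.available_from <= task.window[1]


def coalition_is_feasible(coalition: Sequence[Agent], task: Task) -> bool:
    """A coalition is feasible if every member meets the time window and the
    pooled resources cover the task demand for every resource type."""
    if not coalition or not all(in_window(a, task) for a in coalition):
        return False
    pooled = [sum(a.resources[k] for a in coalition) for k in range(len(task.demand))]
    return all(p >= d for p, d in zip(pooled, task.demand))


def select_agent(q_values: Sequence[float], candidates: List[int],
                 epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice of an agent index, restricted to valid candidates."""
    if not candidates:
        raise ValueError("no valid agent to select")
    if rng.random() < epsilon:
        return rng.choice(candidates)                  # explore
    return max(candidates, key=lambda i: q_values[i])  # exploit


def td_target(reward: float, next_q_values: Sequence[float],
              next_candidates: List[int], gamma: float, done: bool) -> float:
    """Standard DQN target: r + gamma * max over valid next actions of Q_target."""
    if done or not next_candidates:
        return reward
    return reward + gamma * max(next_q_values[i] for i in next_candidates)
```

As a usage example, take agents `A = Agent("A", (2.0, 0.0), 0.0)`, `B = Agent("B", (1.0, 3.0), 5.0)` and `C = Agent("C", (0.0, 1.0), 20.0)` with `Task(demand=(3.0, 2.0), window=(0.0, 10.0))`. The coalition `[A, B]` is feasible, `[A]` alone lacks resources, and any coalition containing `C` is rejected because `C` becomes available only after the window closes. Restricting `select_agent` and `td_target` to the candidate indices keeps agents that violate a constraint out of both action selection and the bootstrap target.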

CLC number: