Electronics, Electrical Engineering and Control

Distributed multi-agent coalition task allocation strategy for single pilot operation mode based on DQN

  • Lei DONG ,
  • Hongbing CHEN ,
  • Xi CHEN ,
  • Changxiao ZHAO
  • 1.Key Laboratory of Civil Aircraft Airworthiness Technology,Civil Aviation University of China,Tianjin 300300,China
    2.Civil Aircraft Airworthiness and Repair Key Laboratory of Tianjin,Civil Aviation University of China,Tianjin 300300,China
    3.College of Safety Science and Engineering,Civil Aviation University of China,Tianjin 300300,China
E-mail: zhaochangxiao@yeah.net

Received date: 2022-08-03

  Revised date: 2022-11-30

  Accepted date: 2023-02-23

  Online published: 2023-03-10

Supported by

National Key Research and Development Program(2021YFB1600600);Tianjin Education Commission Scientific Research Project(2022KJ058);Fundamental Research Funds for the Central Universities, Civil Aviation University of China special grant(3122022044);Graduate Research Innovation Funding Project of Civil Aviation University of China(2021YJS011)

Cite this article

Lei DONG , Hongbing CHEN , Xi CHEN , Changxiao ZHAO . Distributed multi-agent coalition task allocation strategy for single pilot operation mode based on DQN[J]. Acta Aeronautica et Astronautica Sinica, 2023 , 44(13) : 327895 . DOI: 10.7527/S1000-6893.2023.27895

Abstract

Distributed task decision-making is key to increasing the autonomy of the multi-agent system in the distributed cooperative flight organization architecture of the Single Pilot Operation (SPO) mode. Against the background of multiple agents cooperating to execute complex tasks, we first build a distributed multi-agent coalition task allocation decision model for the SPO mode that accounts for multiple constraints, including the resource requirements of task payloads, the resource space limits of agents, and execution time windows. We then design a function approximator for the Q-value network and propose a coalition task allocation method based on the Deep Q-Network (DQN). The method selects effective agents and generates the best execution path for the optimal coalition task allocation result, allowing each agent in the coalition to optimize its scheduling in a more adaptive manner. Finally, numerical simulation confirms that the DQN method solves the SPO-mode multi-agent coalition task allocation problem under complex constraints both effectively and quickly.
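The abstract does not give the model or the network in detail, so the following is only a minimal sketch of the general idea, not the authors' method: a coalition is built one agent at a time, infeasible agents (outside the time window, or carrying no resource still needed) are masked out, and the choice among the rest is epsilon-greedy on Q-values. The `Task` and `Agent` classes, all function names, and the fixed `q_values` list standing in for the output of a trained Q-network are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    demand: tuple   # required amount of each resource type (assumed encoding)
    window: tuple   # (earliest, latest) admissible arrival time


@dataclass
class Agent:
    resources: tuple  # amount of each resource type the agent carries
    arrival: float    # time at which the agent can reach the task


def feasible_agents(task, agents, remaining, chosen):
    """Indices of unchosen agents inside the time window that carry
    at least one resource type that is still needed."""
    lo, hi = task.window
    result = []
    for i, agent in enumerate(agents):
        in_window = lo <= agent.arrival <= hi
        useful = any(r > 0 and need > 0
                     for r, need in zip(agent.resources, remaining))
        if i not in chosen and in_window and useful:
            result.append(i)
    return result


def select_action(q_values, feasible, epsilon, rng):
    """Epsilon-greedy choice restricted to feasible actions."""
    if not feasible:
        return None
    if rng.random() < epsilon:
        return rng.choice(feasible)
    return max(feasible, key=lambda i: q_values[i])


def td_target(reward, next_q_values, next_feasible, gamma, done):
    """Standard DQN target r + gamma * max Q(s', a'), with the max
    taken over feasible next actions only."""
    if done or not next_feasible:
        return reward
    return reward + gamma * max(next_q_values[i] for i in next_feasible)


def form_coalition(task, agents, q_values, epsilon=0.0, seed=0):
    """Add agents until the task demand is covered.
    Returns the list of chosen indices, or None if the demand cannot be met."""
    rng = random.Random(seed)
    remaining = list(task.demand)
    chosen = []
    while any(need > 0 for need in remaining):
        feasible = feasible_agents(task, agents, remaining, chosen)
        action = select_action(q_values, feasible, epsilon, rng)
        if action is None:
            return None
        chosen.append(action)
        remaining = [max(0, need - r)
                     for need, r in zip(remaining, agents[action].resources)]
    return chosen
```

In a full DQN, `q_values` would be recomputed by the network from the current state at every step and `td_target` would supply the regression target for training; here both are kept static so that the masking and selection logic can be read in isolation.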
