[1] 王祥科, 刘志宏, 丛一睿, 等. 小型固定翼无人机集群综述和未来发展[J]. 航空学报, 2020, 41(4):023732. WANG X K, LIU Z H, CONG Y R,et al. Miniature fixed-wing UAV swarms:Review and outlook[J]. Acta Aeronautica et Astronautica Sinica, 2020, 41(4):023732(in Chinese). [2] 周浦城, 洪炳镕. 基于对策论的群机器人追捕-逃跑问题研究[J]. 哈尔滨工业大学学报, 2003, 35(9):1056-1059. ZHOU P C, HONG B R. Grouprobot pursuit-evasion problem based on game theory[J]. Journal of Harbin Institute of Technology, 2003, 35(9):1056-1059(in Chinese). [3] 周浦城, 洪炳镕, 王月海. 动态环境下多机器人合作追捕研究[J]. 机器人, 2005, 27(4):289-295, 300. ZHOU P C, HONG B R, WANG Y H. Multi-robot cooperative pursuit under dynamic environment[J]. Robot, 2005, 27(4):289-295, 300(in Chinese). [4] 方宝富, 潘启树, 洪炳镕, 等. 多追捕者-单-逃跑者追逃问题实现成功捕获的约束条件[J]. 机器人, 2012, 34(3):282-291. FANG B F, PAN Q S, HONG B R, et al. Constraintconditions of successful capture in multi-pursuers vs one-evader games[J]. Robot, 2012, 34(3):282-291(in Chinese). [5] 崔一鸣. 多机器人协作的关键技术研究[D]. 南京:南京理工大学, 2008. CUI Y M. Key technologies of multi-robot coordination and cooperation[D]. Nanjing:Nanjing University of Science and Technology, 2008(in Chinese). [6] 熊伟. 多自主水下机器人目标搜索与协同围捕研究[D]. 哈尔滨:哈尔滨工程大学, 2008. XIONG W. Research on target searching and cooperative hunting for autonomous underwater vehicles[D]. Harbin:Harbin Engineering University, 2008(in Chinese). [7] 方宝富. 多机器人追捕关键技术研究[D]. 哈尔滨:哈尔滨工业大学, 2013. FANG B F. Research on key technologies of multi robot pursuit[D]. Harbin:Harbin Institute of Technology, 2013(in Chinese). [8] 陈灿, 莫雳, 郑多, 等. 非对称机动能力多无人机智能协同攻防对抗[J]. 航空学报, 2020, 41(12):324152. CHEN C, MO L, ZHENG D,et al. Cooperative attack-defense game of multiple UAVs with asymmetric maneuverability[J]. Acta Aeronautica et Astronautica Sinica, 2020, 41(12):324152(in Chinese). [9] LIUBARSHCHUK I, ALTHÖFER I. The problem of approach in differential-difference games[J]. International Journal of Game Theory, 2016, 45(3):511-522. [10] EGOROV M. Multi-agent deep reinforcement learning[EB/OL]. http://cs231n.stanford.edu/reports/2016/pdfs/122_Report.pdf.2016. [11] 孙长银, 穆朝絮. 多智能体深度强化学习的若干关键科学问题[J]. 自动化学报, 2020, 46(7):1301-1312. SUN C Y, MUC X. Important scientific problems of multi-agent deep reinforcement learning[J]. Acta Automatica Sinica, 2020, 46(7):1301-1312(in Chinese). [12] 孙彧, 曹雷, 陈希亮, 等. 多智能体深度强化学习研究综述[J]. 计算机工程与应用, 2020, 56(5):13-24. SUN Y, CAO L, CHEN X L, et al. Overview ofmulti-agent deep reinforcement learning[J]. Computer Engineering and Applications, 2020, 56(5):13-24(in Chinese). [13] 陈亮, 梁宸, 张景异, 等. Actor-Critic框架下一种基于改进DDPG的多智能体强化学习算法[J]. 控制与决策, 2021, 36(1):75-82. CHEN L, LIANG C, ZHANG J Y, et al. A multi-agent reinforcement learning algorithm based on improved DDPG in Actor-Critic framework[J]. Control and Decision, 2021, 36(1):75-82(in Chinese). [14] 杜威, 丁世飞. 多智能体强化学习综述[J]. 计算机科学, 2019, 46(8):1-8. DU W, DING S F. Overview onmulti-agent reinforcement learning[J]. Computer Science, 2019, 46(8):1-8(in Chinese). [15] 高昂, 董志明, 李亮, 等. MADDPG算法并行优先经验回放机制[J]. 系统工程与电子技术, 2021, 43(2):420-433. GAO A, DONG Z M, LI L, et al. Parallel priority experience replay mechanism of MADDPG algorithm[J]. Systems Engineering and Electronics, 2021, 43(2):420-433(in Chinese). [16] 舒扬. 多智能体协同控制关键算法研究与应用[D]. 成都:电子科技大学, 2019. SHU Y. Research and application of algorithms for multi-agent cooperative control[D]. Chengdu:University of Electronic Science and Technology of China, 2019(in Chinese). [17] 王桂鸿. 合作型多智能体中的深度强化学习研究[D]. 广州:华南理工大学, 2019. WANG G H. Research on deep reinforcement learning in cooperative multi-agent system[D]. Guangzhou:South China University of Technology, 2019(in Chinese). [18] LOWE R, WU Y, TAMAR A, et al. Multi-agent actor-critic for mixed cooperative-competitive environments[DB/OL]. arXiv pre-print:1706.2275, 2017. [19] 桂熙. 基于MADDPG算法的多智能体协同控制研究[D]. 武汉:武汉纺织大学, 2020. GUI X. Research on multi-agent cooperative control based on MADDPG algorithm[D]. Wuhan:Wuhan Textile University, 2020(in Chinese). [20] 何明, 张斌, 柳强, 等. MADDPG算法经验优先抽取机制[J]. 控制与决策, 2021, 36(1):68-74. HE M, ZHANG B, LIU Q, et al. Multi-agent deep deterministic policy gradient algorithm vi a priori tized experience selected method[J]. Control and Decision, 2021, 36(1):68-74(in Chinese). [21] SHEIKH H U,BÖLÖNI L. Multi-agent reinforcement learning for problems with combined individual and team reward[C]//2020 International Joint Conference on Neural Networks (IJCNN), 2020:1-8. [22] YANG J, NAKHAEI A, ISELE D, et al. CM3:cooperative mul-ti-goal multi-stage multi-agent reinforcement[EB/OL]. arXiv pre-print arXiv:1809.05188, 2018. [23] SHEIKH H U,BÖLÖNI L. Designing a multi-objective reward function for creating teams of robotic bodyguards using deep reinforcement learning[C]//35th International Conference on Maching Learning, 2019. [24] 张耀中, 许佳林, 姚康佳, 等. 基于DDPG算法的无人机集群追击任务[J]. 航空学报, 2020, 41(10):324000. ZHANG Y Z, XU J L, YAO K J,et al. Pursuit missions for UAV swarms based on DDPG algorithm[J]. Acta Aeronautica et Astronautica Sinica, 2020, 41(10):324000(in Chinese). [25] WANG Y D, DONG L, SUN C Y. Cooperative control for multi-player pursuit-evasion games with reinforcement learning[J]. Neurocomputing, 2020, 412:101-114. [26] 马俊冲. 基于多机器人系统的多目标围捕协同控制问题研究[D]. 长沙:国防科技大学, 2018. MA J C. Research on encirclement control for A group of targets by multi-robot system[D]. Changsha:National University of Defense Technology, 2018(in Chinese). [27] ZHU J G, ZOU W, ZHU Z. Learningevasion strategy in pursuit-evasion by deep Q-network[C]//201824th International Conference on Pattern Recognition (ICPR). Piscataway:IEEE Press, 2018:67-72. |