ACTA AERONAUTICA ET ASTRONAUTICA SINICA
Target round-up control for multi-agent systems based on reinforcement learning
Received date: 2022-05-21
Revised date: 2022-06-23
Accepted date: 2022-07-01
Online published: 2022-07-08
Supported by: National Natural Science Foundation of China (61673200); Shandong Province Major Basic Research Project (ZR2018ZC0438)
A target round-up control method for multi-agent systems is proposed based on reinforcement learning. First, the multi-agent system is modeled as a Markov game. A potential energy function that drives the agents to the desired state while avoiding obstacles is designed for the round-up task, and reinforcement learning is combined with this model-based control, so that the round-up is carried out by multi-agent reinforcement learning guided by the potential energy model. Second, building on this potential energy model, two round-up strategies are established: tracking round-up and circumnavigation round-up. In the first strategy, a velocity potential energy function is designed so that the agents track the target consistently. In the second strategy, virtual circumnavigation points are added to the design of the potential energy functions to achieve the desired circumnavigation. Finally, simulations verify the effectiveness of the round-up control based on multi-agent reinforcement learning.
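As a rough illustration of the idea summarized above, the following Python sketch shows one way a potential energy function could guide a reinforcement learning reward. It is not the formulation used in the paper: the quadratic attractive term, the barrier-style repulsive term, the parameters `k_att`, `k_rep`, and `d_safe`, and all function names are assumptions chosen for illustration only.

```python
import math

def attractive_potential(pos, goal, k_att=1.0):
    """Quadratic potential (assumed form) that is zero at the desired position."""
    return 0.5 * k_att * math.dist(pos, goal) ** 2

def repulsive_potential(pos, obstacles, d_safe=1.0, k_rep=1.0):
    """Barrier-style potential (assumed form), active only within d_safe."""
    total = 0.0
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # guard against division by zero
        if d < d_safe:
            total += 0.5 * k_rep * (1.0 / d - 1.0 / d_safe) ** 2
    return total

def potential(pos, goal, obstacles):
    """Total potential: attraction to the goal plus repulsion from obstacles."""
    return attractive_potential(pos, goal) + repulsive_potential(pos, obstacles)

def circumnavigation_points(target, radius, n_agents, phase=0.0):
    """Evenly spaced virtual points on a circle around the target (hypothetical helper)."""
    return [
        (target[0] + radius * math.cos(phase + 2 * math.pi * i / n_agents),
         target[1] + radius * math.sin(phase + 2 * math.pi * i / n_agents))
        for i in range(n_agents)
    ]

def shaped_reward(pos, next_pos, goal, obstacles):
    """Reward equal to the decrease in potential over one step."""
    return potential(pos, goal, obstacles) - potential(next_pos, goal, obstacles)
```

Under these assumptions, an agent earns a positive reward when a step lowers its potential (moving toward its assigned point and away from obstacles) and a negative reward otherwise. For a circumnavigation-style strategy, each agent's `goal` could be one of the points returned by `circumnavigation_points`, with `phase` advanced over time so that the points rotate around the target.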
Zhilin FAN, Hongyong YANG, Yilin HAN. Target round-up control for multi-agent systems based on reinforcement learning[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2023, 44(S1): 727487-727487. DOI: 10.7527/S1000-6893.2022.27487