Electronics and Electrical Engineering and Control

Time-varying formation control for heterogeneous clusters with switching topologies via reinforcement learning

  • Jiaxiu YANG,
  • Xinkai LI,
  • Hongli ZHANG,
  • Hao WANG
  • School of Electrical Engineering, Xinjiang University, Urumqi 830017, China
E-mail: lxk@xju.edu.cn

Received date: 2023-06-13

Revised date: 2023-06-26

Accepted date: 2023-08-23

Online published: 2023-09-01

Supported by

National Natural Science Foundation of China (62263030); Youth Project of Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C86)

Abstract

To address time-varying formation control of high-order heterogeneous unmanned cluster systems with uncertain model dynamics and switching communication topologies, an optimal distributed hierarchical formation control method based on integral reinforcement learning is proposed. First, time-varying formation switching vectors are used to construct an augmented system of multi-quadrotor Unmanned Aircraft Systems (UAS) and multi-unmanned vehicle systems, which transforms the time-varying formation control problem for the heterogeneous cluster into a stabilization problem. A value function with a discount factor is then introduced to transform this stabilization problem into an optimal control problem. Only the feedback gain parameters are replaced and averaged to obtain the optimal time-varying formation switching control protocol for the whole heterogeneous cluster, so the consensus-based distributed formation control protocol is left intact. The feedback gain of the distributed time-varying formation switching controller is updated online in real time using a single-network "actor-critic" structure that combines the integral reinforcement learning algorithm with the distributed control method. The effectiveness and superiority of the proposed control scheme are verified by theoretical proof and simulation experiments.
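
For orientation, the reduction described in the abstract can be written in the form commonly used in the integral reinforcement learning literature for linear systems. This is only a sketch under assumptions, not the paper's exact formulation: the augmented state X, input u, matrices A and B, weights Q ⪰ 0 and R ≻ 0, discount factor γ > 0, reinforcement interval T, gain K and kernel P are all notation introduced here for illustration.

```latex
% Assumed notation (not taken from the paper): X, u, A, B, Q, R, \gamma, T, K, P
\begin{aligned}
% augmented formation-error dynamics under a linear feedback policy
\dot{X}(t) &= A X(t) + B u(t), \qquad u(t) = -K X(t), \\
% discounted quadratic value function of that policy
V\bigl(X(t)\bigr) &= \int_{t}^{\infty} e^{-\gamma(\tau - t)}
    \bigl[ X^{\top} Q X + u^{\top} R u \bigr]\, d\tau
    = X^{\top}(t)\, P\, X(t), \\
% integral (interval) Bellman equation used for policy evaluation
X^{\top}(t)\, P\, X(t) &= \int_{t}^{t+T} e^{-\gamma(\tau - t)}
    \bigl[ X^{\top} Q X + u^{\top} R u \bigr]\, d\tau
    + e^{-\gamma T}\, X^{\top}(t+T)\, P\, X(t+T), \\
% policy improvement: updated feedback gain
K_{\text{new}} &= R^{-1} B^{\top} P .
\end{aligned}
```

In this generic form the integral Bellman equation is evaluated from trajectory data collected over [t, t+T], so the drift matrix A does not appear in it explicitly. That is the usual reason integral reinforcement learning is applied when the system dynamics are only partially known.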

Cite this article

Jiaxiu YANG, Xinkai LI, Hongli ZHANG, Hao WANG. Time-varying formation control for heterogeneous clusters with switching topologies via reinforcement learning[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2024, 45(10): 329166-329166. DOI: 10.7527/S1000-6893.2023.29166
