This paper addresses the problem of multiple microsatellites cooperatively taking over the attitude motion of a failed satellite, and studies cooperative game strategy learning and coordinated control under the control constraints of the microsatellites. First, a cooperative game model of the microsatellites is established, and an explicit expression is given for the Pareto optimal strategy that accommodates the microsatellite control constraints. Second, to learn this strategy, a concurrent-learning-based Policy Iteration (PI) method is designed that uses past and current data in parallel, which relaxes the Persistency of Excitation (PE) condition otherwise required for learning the Neural Network (NN) weight vector. The condition that the recorded past data must satisfy to ensure convergence of the NN weight estimates is given, and the Uniform Ultimate Boundedness (UUB) of the NN weight estimation errors is analyzed using the Lyapunov method. The concurrent learning PI method is then used to approximate the numerical solution of the Pareto optimal strategy of the microsatellites. The resulting cooperative game strategy has a feedback control form: once the NN weights have been learned, each microsatellite computes its own strategy independently, which yields closed-loop coordinated control throughout the attitude takeover of the failed satellite. The method avoids the torque allocation required by traditional attitude control methods, so its control computation complexity does not depend on the number of microsatellites. Finally, numerical simulations are conducted to verify the effectiveness of the proposed method.
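The role of recorded data in relaxing the PE requirement can be illustrated with a toy example. The sketch below is not the paper's NN or its PI scheme: it assumes a two-weight linear approximator, a discrete-time gradient update, and a rank condition on the stored regressors of the kind commonly used in the concurrent-learning literature (the abstract does not state the paper's exact condition). The names `TRUE_W`, `gram_det`, and `learn` are hypothetical and introduced only for illustration.

```python
# Illustrative only: a 2-weight linear approximator, not the paper's NN or PI scheme.
TRUE_W = [2.0, -1.0]  # hypothetical ideal weights, unknown to the learner


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gram_det(regressors):
    """Determinant of sum(phi * phi^T) over 2-D regressors.

    A positive value means the stored regressors span both weight directions
    (the assumed rank condition on the recorded data).
    """
    a = sum(p[0] * p[0] for p in regressors)
    b = sum(p[0] * p[1] for p in regressors)
    d = sum(p[1] * p[1] for p in regressors)
    return a * d - b * b


def learn(memory, steps=2000, rate=0.05):
    """Gradient descent on squared error over current plus recorded samples.

    memory is a list of (phi, y) pairs recorded in the past; pass [] to
    learn from the current sample only.
    """
    w = [0.0, 0.0]
    phi_now = [1.0, 0.0]  # current regressor: excites only the first weight
    y_now = dot(TRUE_W, phi_now)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for phi, y in [(phi_now, y_now)] + list(memory):
            err = dot(w, phi) - y
            grad = [g + err * p for g, p in zip(grad, phi)]
        w = [wi - rate * gi for wi, gi in zip(w, grad)]
    return w


if __name__ == "__main__":
    recorded = [([1.0, 1.0], 1.0), ([1.0, -1.0], 3.0)]  # y = TRUE_W . phi
    print("rank check:", gram_det([phi for phi, _ in recorded]) > 0)
    print("current data only:", learn([]))
    print("with recorded data:", learn(recorded))
```

With the current sample alone, the second weight never moves from its initial value because the current regressor does not excite it. Adding recorded samples whose regressors span both directions lets both weight estimates converge, even though the current signal is still not persistently exciting.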