Electronics, Electrical Engineering and Control

Spacecraft game decision making for threat avoidance of space targets based on machine learning

  • Honglin ZHANG,
  • Jianjun LUO,
  • Weihua MA
  • 1. School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China
    2. Science and Technology on Aerospace Flight Dynamics Laboratory, Xi'an 710072, China
E-mail: jjluo@nwpu.edu.cn

Received date: 2023-06-06

Revised date: 2023-08-22

Accepted date: 2023-11-02

Online published: 2023-11-16

Supported by

National Natural Science Foundation of China (12072269); Foundation of Science and Technology on Aerospace Flight Dynamics Laboratory (6142210210302)

Cite this article

Honglin ZHANG, Jianjun LUO, Weihua MA. Spacecraft game decision making for threat avoidance of space targets based on machine learning[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(8): 329136-329136. DOI: 10.7527/S1000-6893.2023.29136

Abstract

For the decision-making problem of a spacecraft avoiding the close-approach threat posed by a space target, an intelligent decision-making framework and an autonomous decision-making method based on deep reinforcement learning are proposed. Considering the maneuvering characteristics of space targets and the game-theoretic nature of threat avoidance, an intelligent game decision-making framework for spacecraft threat avoidance is built on the Observe-Orient-Decide-Act (OODA) loop and machine learning methods. Based on this framework and on inference of the target's motion intention, a deep reinforcement learning-based maneuver decision-making algorithm and its training environment are designed so that the spacecraft's decision and control can respond to an adversarial target; the resulting policy avoids targets that exhibit typical motion intentions. Self-play training is then used to improve the generalization of the autonomous maneuver decision-making algorithm and its adaptability to uncertain target maneuvers. Finally, simulation examples and their analysis verify the effectiveness of the proposed method.
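To make the OODA structure named in the abstract concrete, the following is a minimal sketch under strong assumptions and is not the authors' algorithm. All names (`RelState`, `observe`, `orient`, `decide`, `act`, `run_ooda`) are hypothetical. The relative motion is a planar double integrator rather than orbital dynamics. The intent label is a simple range-rate test standing in for the intention-inference step. The hand-coded `decide` rule stands in for the trained deep reinforcement learning policy.

```python
import math
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RelState:
    """Target position and velocity relative to the spacecraft (planar, arbitrary units)."""
    x: float
    y: float
    vx: float
    vy: float


def observe(state: RelState) -> RelState:
    # Observe: a real system would process sensor measurements; here it is a pass-through.
    return state


def orient(obs: RelState) -> str:
    # Orient: crude intent label from the range rate (negative means the target is closing).
    rng = math.hypot(obs.x, obs.y)
    range_rate = (obs.x * obs.vx + obs.y * obs.vy) / rng if rng > 0 else 0.0
    return "approach" if range_rate < 0.0 else "benign"


def decide(obs: RelState, intent: str, a_max: float = 0.01) -> Tuple[float, float]:
    # Decide: placeholder for a learned policy (assumption).
    # When the target is approaching, thrust directly away from it.
    rng = math.hypot(obs.x, obs.y)
    if intent != "approach" or rng == 0.0:
        return (0.0, 0.0)
    return (-a_max * obs.x / rng, -a_max * obs.y / rng)


def act(state: RelState, accel: Tuple[float, float], dt: float = 1.0) -> RelState:
    # Act: spacecraft acceleration enters the relative state with opposite sign
    # (the target is assumed not to accelerate in this toy model).
    ax, ay = accel
    vx = state.vx - ax * dt
    vy = state.vy - ay * dt
    return RelState(state.x + vx * dt, state.y + vy * dt, vx, vy)


Policy = Callable[[RelState, str], Tuple[float, float]]


def run_ooda(state: RelState, policy: Policy = decide,
             steps: int = 200, dt: float = 1.0) -> float:
    """Run the loop for a fixed number of steps and return the minimum range reached."""
    min_range = math.hypot(state.x, state.y)
    for _ in range(steps):
        obs = observe(state)
        intent = orient(obs)
        accel = policy(obs, intent)
        state = act(state, accel, dt)
        min_range = min(min_range, math.hypot(state.x, state.y))
    return min_range


if __name__ == "__main__":
    start = RelState(x=100.0, y=0.0, vx=-1.0, vy=0.0)  # head-on approach
    print("coasting :", run_ooda(start, policy=lambda obs, intent: (0.0, 0.0)))
    print("evading  :", run_ooda(start))
```

In a learning-based version, `decide` would be replaced by a policy network, and a self-play setup would additionally let the target's maneuver come from a second trained policy instead of a fixed trajectory. Those parts are deliberately left out of this sketch.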
