Flight Mechanics and Guidance Control

Space target rendezvous sequence planning via pointer networks

  • Jiacheng ZHANG,
  • Yuehe ZHU,
  • Yazhong LUO
  • 1. College of Aerospace Science, National University of Defense Technology, Changsha 410073, China
    2. Hunan Key Laboratory of Intelligent Planning and Simulation for Aerospace Missions, Changsha 410073, China
E-mail: luoyz@nudt.edu.cn

Received date: 2023-04-11

  Revised date: 2023-04-22

  Accepted date: 2023-05-06

  Online published: 2023-05-12

Supported by

National Natural Science Foundation of China (12125207)

Cite this article

ZHANG J C, ZHU Y H, LUO Y Z. Space target rendezvous sequence planning via pointer networks[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(15): 528698. DOI: 10.7527/S1000-6893.2023.28698 (in Chinese).

Abstract

Planning a traversal rendezvous mission in which a single spacecraft visits multiple space targets is a highly complex mixed-integer optimization problem: it couples combinatorial optimization of the top-level rendezvous sequence with continuous optimization of the bottom-level flight trajectories. Existing methods optimize the discrete and continuous variables together, which is computationally inefficient and rarely yields the optimal sequence. We propose a learning-based sequence planning method built on pointer networks that efficiently obtains a near-optimal sequence. First, a neural network model for multi-target traversal rendezvous sequence planning is constructed to serve as the sequencing decision agent. Second, an unsupervised learning method based on the asynchronous advantage actor-critic algorithm is proposed, which avoids the computational cost of generating labeled training data. Finally, an approximation method that rapidly estimates the actual transfer cost is embedded in the training process to speed up reward evaluation. Case studies show that the proposed training method significantly improves training efficiency, and that the trained agent rapidly finds the optimal sequence with an accuracy of more than 88.7%.
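
To make the pointer-network idea in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It shows how a decoder can "point" at one unvisited target per step, so that every rollout is a valid traversal sequence, and how a rollout's reward could be compared with a baseline in an actor-critic setting. Several things here are assumptions for illustration only: the attention uses a plain dot product with untrained inputs (a trained model would use learned encoder and decoder states), the last chosen target is reused as the next query, `toy_cost` is a hypothetical Euclidean stand-in for the paper's fast transfer-cost estimate, and all function names are invented.

```python
import math
import random


def pointer_probs(query, keys, visited):
    """Softmax over unvisited targets; visited targets get probability 0."""
    scores = [
        -math.inf if i in visited else sum(q * k for q, k in zip(query, key))
        for i, key in enumerate(keys)
    ]
    peak = max(scores)  # finite as long as one target is still unvisited
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def toy_cost(a, b):
    """Hypothetical stand-in for a fast transfer-cost estimate (Euclidean distance)."""
    return math.dist(a, b)


def rollout(targets, start, rng, greedy=False):
    """Build one visiting sequence; return (sequence, log-probability, total cost)."""
    visited, sequence = set(), []
    query, log_prob, cost = start, 0.0, 0.0
    for _ in targets:
        probs = pointer_probs(query, targets, visited)
        if greedy:
            choice = max(range(len(targets)), key=probs.__getitem__)
        else:
            choice = rng.choices(range(len(targets)), weights=probs)[0]
        log_prob += math.log(probs[choice])
        cost += toy_cost(query, targets[choice])
        visited.add(choice)
        sequence.append(choice)
        query = targets[choice]  # assumption: last target acts as the next query
    return sequence, log_prob, cost


def advantage(cost, baseline):
    """Reward (negative total cost) minus a critic's baseline estimate of the reward."""
    return -cost - baseline
```

Calling `rollout(targets, start, random.Random(0))` with a list of equal-length feature tuples returns a permutation of the target indices together with its log-probability and toy cost. In a policy-gradient update, that log-probability would be weighted by `advantage(cost, baseline)`, which requires only a cost estimate and no labeled optimal sequences.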
