Acta Aeronautica et Astronautica Sinica ›› 2023, Vol. 44 ›› Issue (19): 328420-328420. doi: 10.7527/S1000-6893.2023.28420

• Electronics and Electrical Engineering and Control •

A spacecraft rendezvous and docking method based on inverse reinforcement learning

Chenglei YUE1,2, Xuechuan WANG1,2, Xiaokui YUE1,2, Ting SONG3,4

  1. National Key Laboratory of Aerospace Flight Dynamics, Northwestern Polytechnical University, Xi’an 710072, China
  2. School of Astronautics, Northwestern Polytechnical University, Xi’an 710072, China
  3. Shanghai Aerospace Control Technology Institute, Shanghai 201109, China
  4. Shanghai Key Laboratory of Space Intelligent Control Technology, Shanghai 201109, China
  • Received: 2022-12-22 Revised: 2023-01-18 Accepted: 2023-05-24 Online: 2023-10-15 Published: 2023-06-02
  • Contact: Xuechuan WANG E-mail: xcwang@nwpu.edu.cn
  • Supported by:
    National Natural Science Foundation of China(U2013206)

Abstract:

For spacecraft proximity maneuvering and rendezvous, a method is proposed for training neural networks by generative adversarial inverse reinforcement learning, with model predictive control providing the expert dataset. First, a dynamics model of the chaser spacecraft approaching a static target is established, subject to a maximum velocity constraint, a control input saturation constraint, and a space cone constraint, and model predictive control is used to drive the chaser spacecraft to the target. Second, disturbances are added to the nominal trajectory, and the trajectory from each resulting starting position to the target is computed with the same model predictive control method. The states and commands at each time step of these trajectories are collected to form the training set. Finally, the network structure, parameters, and training hyperparameters are set, and the network is trained on the training set using the adversarial inverse reinforcement learning method. Simulation results show that adversarial inverse reinforcement learning can imitate the behavior of the expert trajectories, and that the trained neural network successfully drives the spacecraft from the starting point to the static target.
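
The data-collection step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the double-integrator relative dynamics, the PD law `stand_in_expert` that stands in for the model predictive controller, the Gaussian perturbation of the start position, and all numeric values. The sketch shows only the general pattern of perturbing a nominal start, rolling out an expert under input saturation, and recording state-command pairs, plus one common way to test a cone constraint.

```python
import math
import random


def saturate(u, u_max):
    """Clip each command component to [-u_max, u_max] (input saturation)."""
    return [max(-u_max, min(u_max, ui)) for ui in u]


def in_approach_cone(pos, axis, half_angle):
    """Return True if pos (target at the origin) lies inside a cone about axis."""
    r = math.sqrt(sum(p * p for p in pos))
    a = math.sqrt(sum(c * c for c in axis))
    if r == 0.0:
        return True  # the apex itself is treated as inside
    cos_angle = sum(p * c for p, c in zip(pos, axis)) / (r * a)
    return cos_angle >= math.cos(half_angle)


def stand_in_expert(state, kp=0.02, kd=0.3):
    """Hypothetical PD law used in place of the MPC expert (assumption)."""
    n = len(state) // 2
    pos, vel = state[:n], state[n:]
    return [-kp * p - kd * v for p, v in zip(pos, vel)]


def perturbed_starts(nominal, sigma, count, seed=0):
    """Draw start positions around the nominal one (Gaussian noise, assumed)."""
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in nominal] for _ in range(count)]


def rollout(start, expert, u_max=0.1, dt=1.0, steps=200):
    """Roll out assumed double-integrator dynamics, recording (state, command)."""
    pos, vel = list(start), [0.0] * len(start)
    pairs = []
    for _ in range(steps):
        state = pos + vel
        u = saturate(expert(state), u_max)
        pairs.append((state, u))
        vel = [v + ui * dt for v, ui in zip(vel, u)]
        pos = [p + v * dt for p, v in zip(pos, vel)]
    return pairs


def build_training_set(nominal, sigma=1.0, count=5):
    """Collect state-command pairs from every perturbed start position."""
    data = []
    for start in perturbed_starts(nominal, sigma, count):
        data.extend(rollout(start, stand_in_expert))
    return data
```

A set built this way, for example `build_training_set([100.0, 0.0, 0.0])`, would play the role of the expert demonstrations that an adversarial inverse reinforcement learning discriminator compares against the policy's own trajectories. In the paper the constraints are handled by the model predictive controller itself, so the clipping and cone check above are only simplified stand-ins.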

Key words: model predictive control, generative adversarial inverse reinforcement learning, imitation learning, network training, neural network
