Electronics and Electrical Engineering and Control

Transformer-based monocular satellite pose estimation

  • WANG Zi,
  • SUN Xiaoliang,
  • LI Zhang,
  • CHENG Zilong,
  • YU Qifeng
  • 1. College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410073, China
  • 2. China Astronaut Research and Training Center, Beijing 100094, China

Received date: 2021-01-21

Revised date: 2021-02-05

Online published: 2021-04-27

Supported by

National Natural Science Foundation of China (62003357); Postgraduate Scientific Research Innovation Project of Hunan Province (CX20200024, CX20200025, CX20200088)

Abstract

Owing to its high measurement accuracy and low equipment cost, satellite pose estimation from monocular images holds broad promise for rendezvous and docking, space attack-defense, and other applications. With their strong capacity for feature extraction and representation, convolutional neural networks have achieved significantly better performance than traditional methods in monocular pose estimation. However, existing convolutional-neural-network-based methods suffer from inductive bias, indirect description of absolute distance, and limited long-range modeling ability. Considering the application requirements of monocular satellite pose estimation, this paper applies the transformer model to satellite pose estimation to overcome these problems and proposes a novel end-to-end monocular satellite pose estimation method. A satellite target representation based on a set of key points is proposed, and a loss function built on this representation is established. An end-to-end key point regression network is then developed according to the characteristics of the key point regression task, with an improved backbone structure for feature extraction. Experimental results on public datasets show that the proposed method achieves reliable and efficient monocular pose estimation of satellite targets and outperforms existing methods of the same kind.
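The abstract does not give implementation details, so the following is only a minimal sketch of the general approach it describes: a DETR-style transformer (cf. Carion et al., ECCV 2020) with learned queries that regresses a fixed set of satellite key points from a monocular image. The backbone choice, query count, and head dimensions below are illustrative assumptions, not the authors' configuration.

```python
# Illustrative sketch only: a DETR-style key point regression network.
# All hyperparameters (backbone, query count, dimensions) are assumptions
# for illustration, not the configuration from the paper.
import torch
import torch.nn as nn
import torchvision


class KeypointTransformer(nn.Module):
    def __init__(self, num_keypoints=11, d_model=256, nhead=8, num_layers=6):
        super().__init__()
        # CNN backbone truncated before global pooling; yields a feature map.
        resnet = torchvision.models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.input_proj = nn.Conv2d(2048, d_model, kernel_size=1)
        # 2D positional encodings (as in DETR) are omitted here for brevity.
        self.transformer = nn.Transformer(d_model=d_model, nhead=nhead,
                                          num_encoder_layers=num_layers,
                                          num_decoder_layers=num_layers)
        # One learned query per key point of the satellite wireframe model.
        self.queries = nn.Embedding(num_keypoints, d_model)
        self.head = nn.Linear(d_model, 2)  # normalized (x, y) per key point

    def forward(self, images):                         # (B, 3, H, W)
        feat = self.input_proj(self.backbone(images))  # (B, d_model, h, w)
        b = feat.shape[0]
        src = feat.flatten(2).permute(2, 0, 1)         # (h*w, B, d_model)
        tgt = self.queries.weight.unsqueeze(1).expand(-1, b, -1)
        out = self.transformer(src, tgt)               # (K, B, d_model)
        return torch.sigmoid(self.head(out)).permute(1, 0, 2)  # (B, K, 2)
```

In such a pipeline, training would penalize the regressed set against ground-truth projections of the satellite's 3D key points (e.g. with an L1 or smooth-L1 loss), and at inference the predicted 2D points, paired with the known 3D model points, would yield the 6-DoF pose through a PnP solver such as cv2.solvePnP (optionally inside RANSAC for robustness).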

Cite this article

WANG Zi, SUN Xiaoliang, LI Zhang, CHENG Zilong, YU Qifeng. Transformer-based monocular satellite pose estimation[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2022, 43(5): 325298-325298. DOI: 10.7527/S1000-6893.2021.25298
