Acta Aeronautica et Astronautica Sinica, 2024, Vol. 45, Issue 22: 330248. doi: 10.7527/S1000-6893.2024.30248

Monocular pose measurement of non-cooperative targets based on a lightweight neural network

王梓1,2, 王靖皓1,2, 李杨1,2, 李璋1,2, 于起峰1,2

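  1. College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410073
  2. Hunan Provincial Key Laboratory of Image Measurement and Vision Navigation, Changsha 410073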
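  • Received: 2024-01-29 Revised: 2024-04-07 Accepted: 2024-04-26 Online: 2024-11-25 Published: 2024-04-30
  • Corresponding author: 李璋 E-mail: zhangli_nudt@163.com
  • Funding:
    National Natural Science Foundation of China (12302252); Youth Independent Innovation Science Fund of the National University of Defense Technology (ZK24-31)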

Non-cooperative target pose estimation from monocular images based on lightweight neural network

Zi WANG1,2, Jinghao WANG1,2, Yang LI1,2, Zhang LI1,2, Qifeng YU1,2

  1. College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410073, China
  2. Hunan Provincial Key Laboratory of Image Measurement and Vision Navigation, Changsha 410073, China
  • Received: 2024-01-29 Revised: 2024-04-07 Accepted: 2024-04-26 Online: 2024-11-25 Published: 2024-04-30
  • Contact: Zhang LI E-mail: zhangli_nudt@163.com
  • Supported by:
    National Natural Science Foundation of China (12302252); Research Program of National University of Defense Technology (ZK24-31)

Abstract (translated from the Chinese):

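Monocular pose measurement of non-cooperative targets is one of the key technologies for future space missions, and methods based on deep neural networks have achieved higher pose measurement accuracy than traditional methods. However, because computing resources are limited on orbit, the neural networks used by existing methods have too many parameters and too high a computational complexity to meet the demands of real-time on-orbit measurement. As the number of model parameters decreases, the feature extraction and representation capability of a neural network declines, which lowers pose measurement accuracy; achieving high-accuracy pose measurement with a lightweight neural network is therefore a pressing open problem. To address it, this work follows the semantic-keypoint approach and proposes a pose measurement method for non-cooperative targets based on a lightweight neural network. First, a lightweight neural network for heatmap regression is designed, with only 1.1 × 10⁶ parameters. Next, to improve the accuracy of semantic keypoint localization and pose measurement, a semantic keypoint localization method based on sub-pixel decoding is proposed, which also enables end-to-end supervision. Finally, an auxiliary-layer supervised training method is proposed to further improve semantic keypoint localization accuracy. Experiments on public datasets show that, among lightweight models with fewer than 10⁷ parameters, the proposed method achieves the highest pose measurement accuracy with the fewest parameters. Experiments on an embedded development board show that the proposed method reaches measurement frequencies of 5 Hz and 11 Hz in the 10 W and 30 W power modes, respectively.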

Keywords: pose measurement, monocular vision, lightweight, deep learning, semantic keypoint

Abstract:

Estimating the pose of non-cooperative targets from monocular images is a key technology for future space missions. Methods based on deep neural networks have surpassed traditional methods in pose measurement accuracy. However, the networks they use have large parameter counts and high computational complexity, so they cannot meet the real-time requirements of on-orbit applications, where computational resources are limited. Reducing the number of parameters weakens a network's ability to extract representative features, which degrades pose estimation accuracy; achieving high-accuracy pose estimation with a lightweight network is therefore an open problem. To address it, we follow a semantic-keypoint pipeline and propose a pose estimation method for non-cooperative targets based on a lightweight neural network. First, we design a lightweight neural network for heatmap regression with only 1.1 × 10⁶ parameters. Second, to improve the accuracy of semantic keypoint localization and the subsequent pose estimation, we introduce a heatmap decoding technique that yields sub-pixel keypoint locations while enabling end-to-end supervision of keypoint localization. Third, we develop an auxiliary-layer supervised training method to further improve keypoint localization accuracy. Experiments on public datasets show that, among lightweight models with fewer than 10⁷ parameters, our method achieves the highest pose measurement accuracy while using the fewest parameters. Tests on an embedded development board show that our method reaches measurement frequencies of 5 Hz and 11 Hz in the 10 W and 30 W power modes, respectively.
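The abstract does not spell out the sub-pixel decoding operator. One common way to obtain sub-pixel keypoint coordinates from a heatmap while keeping the decoding step differentiable (and hence compatible with end-to-end supervision) is a soft-argmax: apply a softmax over the spatial grid and take the expected pixel coordinate. The sketch below illustrates that general idea in NumPy and is an assumption, not necessarily the method used in the paper; the names `soft_argmax_2d` and `gaussian_logits`, and the synthetic test heatmap, are hypothetical.

```python
import numpy as np


def soft_argmax_2d(logits):
    """Decode one keypoint from a 2-D map of unnormalised scores.

    Assumption: the network outputs raw scores (logits) per pixel.
    A spatial softmax turns them into a probability map, and the
    expected (x, y) coordinate under that map is returned.
    """
    h, w = logits.shape
    # Subtract the maximum before exponentiating for numerical stability.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    xs = np.arange(w, dtype=np.float64)
    ys = np.arange(h, dtype=np.float64)
    # Marginalise over rows to get p(x), over columns to get p(y).
    x = float((probs.sum(axis=0) * xs).sum())
    y = float((probs.sum(axis=1) * ys).sum())
    return x, y


def gaussian_logits(h, w, cx, cy, sigma=1.5):
    """Synthetic score map: the log of an unnormalised Gaussian blob
    centred at the (possibly fractional) location (cx, cy)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return -((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2)


if __name__ == "__main__":
    logits = gaussian_logits(64, 64, cx=20.3, cy=41.7)
    # Hard argmax can only return integer pixel indices.
    row, col = np.unravel_index(np.argmax(logits), logits.shape)
    print("hard argmax (x, y):", (int(col), int(row)))
    # Soft-argmax recovers the fractional centre.
    print("soft argmax (x, y):", soft_argmax_2d(logits))
```

On this synthetic map the hard argmax snaps to the nearest pixel, whereas the expectation recovers the fractional centre. Because the expectation is a smooth function of the scores, a coordinate loss could be back-propagated through it during training, which is the property that makes this family of decoders suitable for end-to-end supervision.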

Key words: pose measurement, monocular image, lightweight, deep learning, semantic keypoint

CLC number: