Adaptive UAV target tracking algorithm based on multi-scale fusion
Received: 2021-07-15
Revised: 2021-08-03
Accepted: 2021-08-23
Published online: 2021-08-25
Supported by
National Natural Science Foundation of China (61673017)
XUE Yuanliang, JIN Guodong, TAN Lining, XU Jiankun. Adaptive UAV target tracking algorithm based on multi-scale fusion[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(1): 326107. DOI: 10.7527/S1000-6893.2021.26107
To address the small target size, large scale variation, and interference from similar objects encountered in Unmanned Aerial Vehicle (UAV) tracking, an adaptive UAV aerial target tracking algorithm based on multi-scale attention and feature fusion is proposed. First, because the UAV view contains abundant interfering information, a deep and diverse feature extraction network is constructed to provide semantic and diverse features that robustly represent the target. Second, a multi-scale attention module is designed to suppress interfering information while retaining target information at different scales. Third, a feature fusion module fuses features from different layers, effectively integrating detail and semantic information. Finally, multiple region proposal modules based on an anchor-free strategy adaptively perceive the scale variation of the target and make full use of the integrated features to achieve accurate localization and stable tracking. Experimental results show that the algorithm achieves a success rate of 61.7% and a precision of 81.5% on the dataset, at a speed of 40.5 frame/s. The algorithm's target discrimination, scale perception, and anti-interference capabilities are significantly enhanced, so it can effectively cope with the common challenges of UAV tracking.
Key words: unmanned aerial vehicle; object tracking; multi-scale; feature fusion; anchor free
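The abstract reports a success rate and a precision but does not state how they are computed. As a minimal sketch, assuming the one-pass evaluation convention common to tracking benchmarks (success as the area under the overlap-threshold curve, precision as the fraction of frames whose center error is within 20 pixels), the two scores could be obtained as follows. The function names, the `(x, y, w, h)` box layout, the 21 overlap thresholds, and the 20-pixel threshold are all assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def iou_xywh(pred, gt):
    """Per-frame IoU for boxes of shape (N, 4) in (x, y, w, h) format."""
    px2, py2 = pred[:, 0] + pred[:, 2], pred[:, 1] + pred[:, 3]
    gx2, gy2 = gt[:, 0] + gt[:, 2], gt[:, 1] + gt[:, 3]
    # Width and height of the intersection rectangle, clipped at zero.
    iw = np.clip(np.minimum(px2, gx2) - np.maximum(pred[:, 0], gt[:, 0]), 0, None)
    ih = np.clip(np.minimum(py2, gy2) - np.maximum(pred[:, 1], gt[:, 1]), 0, None)
    inter = iw * ih
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    # Guard against degenerate boxes with zero area.
    return np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)


def success_auc(pred, gt, num_thresholds=21):
    """Mean fraction of frames whose IoU exceeds each threshold in [0, 1]."""
    overlaps = iou_xywh(pred, gt)
    thresholds = np.linspace(0.0, 1.0, num_thresholds)
    curve = [(overlaps > t).mean() for t in thresholds]
    return float(np.mean(curve))


def precision_at(pred, gt, pixel_threshold=20.0):
    """Fraction of frames whose center location error is within the threshold."""
    pred_center = pred[:, :2] + pred[:, 2:] / 2.0
    gt_center = gt[:, :2] + gt[:, 2:] / 2.0
    error = np.linalg.norm(pred_center - gt_center, axis=1)
    return float((error <= pixel_threshold).mean())
```

Both functions take arrays of predicted and ground-truth boxes for one sequence; benchmark-level scores are typically averaged over sequences, which this sketch leaves to the caller.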