ACTA AERONAUTICA ET ASTRONAUTICA SINICA
Real-time small target detection networks for UAV remote sensing
Received date: 2024-01-08
Revised date: 2024-01-18
Accepted date: 2024-03-26
Online published: 2024-04-10
Supported by
National Natural Science Foundation of China (52272390); Natural Science Foundation of Heilongjiang Province (YQ2022A009)
Deep learning has greatly improved the performance of object detection in recent years. However, detecting targets in UAV remote sensing images remains challenging: the targets have low resolution and appear against complex backgrounds, and existing algorithms struggle to meet real-time requirements. To overcome these challenges, this paper proposes a Real-Time Small Target Detection (RTSTD) method based on a Multi-scalar & Multi-depth Feature Extraction (MMFE) network, which efficiently detects small targets in UAV remote sensing images. RTSTD crops an input image into multiple small-size images and feeds a portion of them into the lightweight MMFE network. As a result, RTSTD can handle remote sensing images of arbitrary resolution without losing image features. A more effective output is proposed for the MMFE network: an overlap vector that represents the position and confidence of the target in the input image. To improve the MMFE network's ability to distinguish targets from complex backgrounds, the positive and negative samples are redefined. To evaluate RTSTD, seven datasets are selected and reconstructed from UAV123, DTB70 and AU-AIR, comprising a total of 8,369 UAV remote sensing images that involve small target detection in ground and sea scenarios. The experimental results show that RTSTD improves on existing detection methods in both accuracy and speed. It achieves an F-Score of 0.90 or above, with a running speed of over 66 frames per second (FPS) using GPU acceleration and over 35 FPS using only CPU.
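The cropping step and the F-Score metric mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the crop size, how the edge crops are handled, or how the "portion" of crops is selected, so the 64-pixel tile size, the inward-shifted edge tiles, and the helper names `tile_boxes` and `f_score` are all assumptions made for illustration. The F-Score is assumed here to be the balanced F1 score.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int = 64) -> List[Box]:
    """Return crop boxes covering a width x height image with tile x tile patches.

    Assumption (not from the paper): the last row/column is shifted inward
    so that every box lies fully inside the image.
    """
    if width < tile or height < tile:
        raise ValueError("image must be at least one tile in each dimension")

    def starts(length: int) -> List[int]:
        # Regular grid positions, plus one final position flush with the edge
        # when the grid does not divide the image exactly.
        positions = list(range(0, length - tile + 1, tile))
        if positions[-1] + tile < length:
            positions.append(length - tile)
        return positions

    return [(x, y, x + tile, y + tile)
            for y in starts(height)
            for x in starts(width)]


def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F1 score: harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because the boxes are computed from the image size alone, the same routine applies to images of any resolution at least one tile wide and high, which mirrors the arbitrary-resolution property claimed for RTSTD.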
Yanfang LIU, Jiayu SHE, Qiufan YUAN, Rui ZHOU, Naiming QI. Real-time small target detection networks for UAV remote sensing[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2024, 45(14): 630119-630119. DOI: 10.7527/S1000-6893.2024.30119