special column

Small target detection algorithm for UAV based on patch-wise co-attention

  • Aoze YU,
  • Weiwei WEI,
  • Ping WANG,
  • Jinqiang ZHANG,
  • Wenxiong KE
  • 1. Shanghai Radio Equipment Research Institute, Shanghai 201109, China
    2. Shanghai Engineering Research Center of Target Identification and Environment Perception, Shanghai 201109, China
    3. Traffic Perception Radar Technology Research & Development Center of CASC, Shanghai 201109, China
    4. Shanghai Limradar Electronic Technology Co., Ltd., Shanghai 201109, China
E-mail: wwwei802@163.com

Received date: 2023-06-12

Revised date: 2023-06-27

Accepted date: 2023-08-09

Online published: 2023-09-13

Supported by

China Aerospace Science and Technology Group Co., Ltd. Major Civil Independent R&D Project (YF-ZZYF-M-2022-019)

Abstract

Conventional target detection algorithms extract insufficient features and achieve low accuracy on small targets in UAV imagery. To address this, a small target detection algorithm for UAVs based on Patch-Wise Co-Attention (PWCA) is proposed. First, a plug-and-play PWCA module is introduced. The input feature is divided into patches along the spatial dimension, and channel attention weights are extracted from each patch, so that channel information is discriminated according to local spatial features and the network resolves finer detail in small target scenes. The input feature is then fused with the channel-attended feature, and spatial attention is extracted from the result to focus the network on effective feature information. Second, the strided convolutional downsampling of the baseline network is replaced by Adaptive Interlace Downsampling (AID), which is combined with PWCA to weight the downsampled features according to their significance and thereby reduce the loss of small target information during downsampling. Finally, the backbone and the feature fusion network are made lightweight to reduce computational cost. A large-scale feature map detection branch for small targets is added and the flow of feature maps is optimized, which enriches the semantic information of feature maps at different scales and strengthens feature expression while maintaining real-time performance. The Soft-NMS algorithm is used to reduce missed detections when targets are occluded or overlapping. The algorithm is evaluated on the VisDrone2019 dataset. Compared with the YOLOv5s baseline, the improved algorithm raises mAP0.5 by 11.81% and mAP0.5:0.95 by 10.91%, while reducing the model parameters by 59%. The proposed network effectively balances detection accuracy and inference speed for UAV small target detection, which demonstrates its practical value.
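
The patch-wise channel attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `patchwise_channel_attention`, the nested-list feature layout, and the gate (a plain sigmoid of the per-patch channel mean, with no learned layers) are all assumptions made for illustration. The fusion step, the spatial attention step, and AID are not covered.

```python
import math


def _sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def patchwise_channel_attention(feat, patch):
    """Reweight each channel separately inside every spatial patch.

    feat  : feature map as nested lists with shape [C][H][W]
    patch : side length of the square patches (must divide H and W)

    Assumption: the gate is sigmoid(mean of the channel within the patch).
    A trained module would normally apply learned layers before the sigmoid.
    """
    channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    if height % patch or width % patch:
        raise ValueError("patch size must divide the feature map height and width")

    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, top + patch)
            cols = range(left, left + patch)
            for c in range(channels):
                # Average pooling restricted to this patch: one descriptor per channel.
                mean = sum(feat[c][y][x] for y in rows for x in cols) / (patch * patch)
                gate = _sigmoid(mean)  # attention weight in (0, 1)
                for y in rows:
                    for x in cols:
                        out[c][y][x] = feat[c][y][x] * gate
    return out


# One channel, 2 x 4 map: strong response on the left, weak on the right.
example = [[[4.0, 4.0, -4.0, -4.0],
            [4.0, 4.0, -4.0, -4.0]]]
weighted = patchwise_channel_attention(example, patch=2)
```

In this example a single global channel descriptor would average to zero and gate both halves equally, whereas the per-patch gate keeps the left patch nearly unchanged and suppresses the right one. That local discrimination is the motivation the abstract gives for computing channel attention on patches.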

Cite this article

Aoze YU, Weiwei WEI, Ping WANG, Jinqiang ZHANG, Wenxiong KE. Small target detection algorithm for UAV based on patch-wise co-attention[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(14): 629148-629148. DOI: 10.7527/S1000-6893.2023.29148
