
Acta Aeronautica et Astronautica Sinica ›› 2024, Vol. 45 ›› Issue (14): 629148-629148. doi: 10.7527/S1000-6893.2023.29148

• Special column

Small target detection algorithm for UAV based on patch-wise co-attention

Aoze YU1,2, Weiwei WEI1,2, Ping WANG1,2, Jinqiang ZHANG1,3, Wenxiong KE1,4

  1. Shanghai Radio Equipment Research Institute, Shanghai 201109, China
  2. Shanghai Engineering Research Center of Target Identification and Environment Perception, Shanghai 201109, China
  3. Traffic Perception Radar Technology Research & Development Center of CASC, Shanghai 201109, China
  4. Shanghai Limradar Electronic Technology Co., Ltd., Shanghai 201109, China
  • Received: 2023-06-12  Revised: 2023-06-27  Accepted: 2023-08-09  Online: 2024-07-25  Published: 2023-09-13
  • Contact: Weiwei WEI  E-mail: wwwei802@163.com
  • Supported by:
    China Aerospace Science and Technology Group Co., Ltd. Major Civil Independent R&D Project(YF-ZZYF-M-2022-019)

Abstract:

Conventional target detection algorithms extract features insufficiently and achieve low detection accuracy in UAV small target detection tasks. To address these problems, a small target detection algorithm for UAVs based on Patch-Wise Co-Attention (PWCA) is proposed. Firstly, a plug-and-play PWCA module is proposed. The input feature is divided into patches along the spatial dimensions, and channel attention weights are extracted from the patches, so that channel information is discriminated according to local spatial features and the network works at a finer granularity in small target detection scenarios. The input feature and the attended feature are then fused, and spatial attention is extracted from the result to focus the network on effective feature information. Secondly, the strided convolutional downsampling of the baseline network is abandoned, and Adaptive Interlace Downsampling (AID) is proposed in combination with PWCA. AID weights the downsampled features according to their significance and reduces the loss of small target information during downsampling. Finally, the backbone and the feature fusion network are made lightweight to reduce computational cost, and a large-scale feature map detection branch for small targets is added to optimize the flow of feature maps. This enriches the semantic information of feature maps at different scales and enhances feature expression capability while preserving real-time performance. The Soft-NMS algorithm is used to reduce missed detections when targets are occluded or overlapping. The proposed algorithm is evaluated on the VisDrone2019 dataset. Compared with the YOLOv5s baseline, the improved algorithm raises mAP0.5 by 11.81% and mAP0.5:0.95 by 10.91%, while reducing the model parameters by 59%. The proposed network effectively balances detection accuracy and inference speed in UAV small target detection tasks, demonstrating its practical significance.
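
The Soft-NMS step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which Soft-NMS variant or hyperparameters the authors use, so the Gaussian decay, the values of `sigma` and `score_thresh`, and the function names `iou` and `soft_nms` below are all assumptions made for illustration, not the paper's implementation. The idea is that a box overlapping an already selected box has its score reduced rather than being discarded outright, which is why occluded or overlapping targets are less likely to be missed.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (assumed variant; hyperparameters are illustrative).

    Returns a list of (box index, final score) in selection order.
    """
    candidates = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the remaining candidate with the highest current score.
        best_pos = max(range(len(candidates)), key=lambda k: candidates[k][1])
        best_idx, best_score = candidates.pop(best_pos)
        if best_score < score_thresh:
            break
        kept.append((best_idx, best_score))

        # Decay the scores of the others according to their overlap with
        # the selected box, instead of deleting them as hard NMS would.
        decayed = []
        for idx, score in candidates:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((idx, score))
        candidates = decayed
    return kept


if __name__ == "__main__":
    # Two heavily overlapping boxes and one isolated box (made-up data).
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 4))
```

In the demo data the second box overlaps the first heavily, so its score is lowered and it is ranked last, but it is still retained. Hard NMS with a typical IoU threshold would have removed it.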

Key words: UAV, YOLOv5, small target detection, attention mechanism, downsampling, feature fusion
