Special Column on Dim and Small Target Detection and Tracking

Small target detection algorithm for UAV based on patch-wise co-attention

  • Aoze YU ,
  • Weiwei WEI ,
  • Ping WANG ,
  • Jinqiang ZHANG ,
  • Wenxiong KE
  • 1. Shanghai Radio Equipment Research Institute, Shanghai 201109, China
    2. Shanghai Engineering Research Center of Target Identification and Environment Perception, Shanghai 201109, China
    3. Traffic Perception Radar Technology Research & Development Center of CASC, Shanghai 201109, China
    4. Shanghai Limradar Electronic Technology Co., Ltd., Shanghai 201109, China
E-mail: wwwei802@163.com

Received date: 2023-06-12

  Revised date: 2023-06-27

  Accepted date: 2023-08-09

  Online published: 2023-09-13

Supported by

China Aerospace Science and Technology Group Co., Ltd. Major Civil Independent R&D Project (YF-ZZYF-M-2022-019)

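Cite this article

YU Aoze, WEI Weiwei, WANG Ping, ZHANG Jinqiang, KE Wenxiong. Small target detection algorithm for UAV based on patch-wise co-attention[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(14): 629148. DOI: 10.7527/S1000-6893.2023.29148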

Abstract

Conventional object detection algorithms struggle to extract features for small targets in UAV imagery, which leads to low detection accuracy. To address this problem, a small target detection algorithm for UAVs based on Patch-Wise Co-Attention (PWCA) is proposed. First, a plug-and-play PWCA module is introduced. The input feature is split along the spatial dimensions into local patches, and channel attention weights are extracted on each patch, which sharpens the discrimination of channel information over local spatial features and makes the network finer-grained for small target detection. The input feature and the focused feature are then fused, and spatial attention is further extracted so that the network attends to the effective feature information. Second, the strided-convolution downsampling of the baseline network is replaced by an Adaptive Interlace Downsampling (AID) module built on PWCA, which adaptively assigns weights to the downsampled features according to their importance and reduces the loss of small target information during downsampling. Finally, the backbone and the feature fusion network are made lightweight to reduce computation, and a detection branch on a large-size feature map is added for small targets. This optimizes the flow of feature maps, enriches the semantic information of feature maps at different scales, and strengthens feature expression while maintaining real-time performance. The Soft-NMS algorithm is adopted to reduce missed detections when targets are occluded or overlap. The algorithm is evaluated on the public VisDrone2019 dataset. Compared with the YOLOv5s baseline, mAP0.5 improves by 11.81% and mAP0.5:0.95 by 10.91%, while the number of model parameters is reduced by 59%. The proposed network balances detection accuracy and inference speed well for UAV small target detection and is of practical value.
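
The Soft-NMS step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which Soft-NMS variant or parameters the authors used, so the Gaussian score decay, the function names `iou` and `soft_nms`, and the default values of `sigma` and `score_thresh` below are assumptions chosen for illustration only, not the paper's implementation. The idea is that a box overlapping an already selected box has its score reduced rather than being discarded outright.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (illustrative defaults, not taken from the paper).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))
        # Decay the scores of the other boxes according to their overlap
        # with the selected box, and drop those that fall below the threshold.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((idx, score))
        remaining = decayed
    return kept

# Two heavily overlapping boxes plus one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68. Hard NMS with a typical 0.5 threshold would delete it, whereas this sketch keeps it with a lowered score, which is why Soft-NMS tends to help in crowded scenes with occluded targets.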
