航空学报 (Acta Aeronautica et Astronautica Sinica), 2024, Vol. 45, Issue 14: 629148. doi: 10.7527/S1000-6893.2023.29148

基于分块复合注意力的无人机小目标检测算法

于傲泽1,2, 魏维伟1,2, 王平1,2, 张金强1,3, 柯文雄1,4

  1. 上海无线电设备研究所,上海 201109
  2. 上海目标识别与环境感知工程技术研究中心,上海 201109
  3. 中国航天科技集团有限公司交通感知雷达技术研发中心,上海 201109
  4. 上海黎明瑞达电子科技有限公司,上海 201109
  • 收稿日期:2023-06-12 修回日期:2023-06-27 接受日期:2023-08-09 出版日期:2024-07-25 发布日期:2023-09-13
  • 通讯作者: 魏维伟 E-mail:wwwei802@163.com
  • 基金资助:
    中国航天科技集团有限公司民用重大自主研发项目(YF-ZZYF-M-2022-019)

Small target detection algorithm for UAV based on patch-wise co-attention

Aoze YU1,2, Weiwei WEI1,2, Ping WANG1,2, Jinqiang ZHANG1,3, Wenxiong KE1,4

  1. Shanghai Radio Equipment Research Institute, Shanghai 201109, China
  2. Shanghai Engineering Research Center of Target Identification and Environment Perception, Shanghai 201109, China
  3. Traffic Perception Radar Technology Research & Development Center of CASC, Shanghai 201109, China
  4. Shanghai Limradar Electronic Technology Co., Ltd., Shanghai 201109, China
  • Received: 2023-06-12  Revised: 2023-06-27  Accepted: 2023-08-09  Published: 2024-07-25  Online: 2023-09-13
  • Contact: Weiwei WEI, E-mail: wwwei802@163.com
  • Supported by:
    Major Civil Independent R&D Project of China Aerospace Science and Technology Corporation (YF-ZZYF-M-2022-019)

摘要:

针对常规目标检测算法在无人机小目标检测任务上特征提取难度大而导致检测精度低的问题,提出一种基于分块复合注意力的无人机小目标检测算法。首先,提出一种即插即用的分块复合注意力模块(PWCA),输入特征在空间维度切分成局部特征块,在局部特征块上提取通道注意力权重,加强通道信息在局部空间特征上的区分度,提高网络细粒度以适应小目标检测场景,然后融合输入特征与聚焦后的特征,并进一步挖掘空间注意力,关注网络中有效特征信息。其次,抛弃基线网络基于跨步卷积的下采样形式,结合PWCA提出自适应交错下采样模块(AID),根据重要程度自适应地分配下采样后的特征权重,减少下采样过程中小目标的信息损失。最后,对主干及特征融合网络进行轻量化设计,减少计算量并新增针对小目标的大尺寸特征图检测分支,优化了特征图的流动方向,丰富不同尺度特征图的语义信息,增强特征的表达能力,并保证实时性。针对性地采用Soft-NMS算法解决目标遮挡重叠时的漏检问题,提升检测效果。在公开数据集VisDrone2019上验证改进算法的有效性,与YOLOv5s目标检测算法相比,改进后算法最终的mAP0.5比YOLOv5s基线算法提升了11.81%,mAP0.5:0.95提升了10.91%,模型参数减少59%,网络在无人机小目标检测任务上能够较好地兼顾检测精度与推理速度,具有较大的实用意义。

关键词: 无人机, YOLOv5, 小目标检测, 注意力机制, 下采样, 特征融合

Abstract:

Conventional target detection algorithms struggle to extract features from the small targets in UAV imagery, which leads to low detection accuracy. To address this problem, a small target detection algorithm for UAVs based on Patch-Wise Co-Attention (PWCA) is proposed. Firstly, a plug-and-play PWCA module is proposed. The input feature is split into local patches along the spatial dimensions, and channel attention weights are extracted on each patch, which strengthens the discrimination of channel information over local spatial features and makes the network finer-grained for small target detection scenarios. The input feature is then fused with the focused feature, and spatial attention is further extracted so that the network attends to the effective feature information. Secondly, the strided-convolution downsampling of the baseline network is abandoned, and an Adaptive Interlace Downsampling (AID) module is proposed in combination with PWCA; it adaptively assigns weights to the downsampled features according to their importance, reducing the information loss of small targets during downsampling. Finally, the backbone and the feature fusion network are given a lightweight design to reduce the computational cost, and a detection branch on a large-size feature map is added for small targets. This optimizes the flow direction of the feature maps, enriches the semantic information of feature maps at different scales, and enhances feature expression while maintaining real-time performance. In addition, the Soft-NMS algorithm is adopted to reduce missed detections when targets are occluded or overlap. The effectiveness of the improved algorithm is verified on the public VisDrone2019 dataset: compared with the YOLOv5s baseline, mAP0.5 improves by 11.81% and mAP0.5:0.95 by 10.91%, while the number of model parameters is reduced by 59%. The proposed network balances detection accuracy and inference speed well on UAV small target detection tasks and is of considerable practical significance.
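
The abstract states only that Soft-NMS is adopted for occluded and overlapping targets; it does not give the variant or its settings. As a rough illustration of the general idea, the following pure-Python sketch implements the commonly used Gaussian form of Soft-NMS. The function names `iou` and `soft_nms`, the Gaussian decay, and the values `sigma=0.5` and `score_thresh=0.001` are assumptions made for this example, not details taken from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (illustrative; sigma and score_thresh are assumed).

    Returns a list of (original_index, rescored_confidence) pairs in the
    order in which the boxes were selected.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the highest-scoring box among those still remaining.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))
        # Decay the scores of the other boxes by their overlap with the
        # selected box instead of deleting them outright.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((idx, score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    # Two heavily overlapping boxes plus one isolated box.
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(boxes, scores):
        print(idx, round(score, 3))
```

In this toy input the second box overlaps the first with an IoU of about 0.68. Hard NMS with a typical 0.5 threshold would discard it, whereas the sketch keeps it with a reduced score, which is the behaviour that helps when real targets occlude one another.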

Key words: UAV, YOLOv5, small target detection, attention mechanism, downsampling, feature fusion

CLC number: