Acta Aeronautica et Astronautica Sinica > 2023, Vol. 44 Issue (13): 327944-327944   doi: 10.7527/S1000-6893.2022.27944

Aerial-photography dense small target detection algorithm based on adaptive cooperative attention mechanism

Zihao LI, Zhengping WANG, Yuntao HE

  1. School of Astronautics, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2022-08-24 Revised: 2022-09-05 Accepted: 2022-10-20 Published: 2023-07-15 Online: 2022-10-26
  • Contact: Yuntao HE, E-mail: bithyt@bit.edu.cn
  • Supported by:
    Aeronautical Science Foundation of China (2020Z005072001)

Abstract:

To address the large number of targets and the high proportion of small targets in wide-field-of-view UAV aerial imagery, this paper proposes ACAM-YOLO, a UAV aerial object detection algorithm based on an adaptive cooperative attention mechanism. An Adaptive Co-Attention Module (ACAM) is embedded in the backbone network and the feature enhancement network. ACAM first splits the input features along the channel dimension, then extracts spatial attention features and channel attention features separately, and finally fuses them by adaptive weighting into cooperative attention weights, making more effective use of the spatial and channel information in the input features. To improve detection accuracy while keeping the detection network lightweight, the backbone network, feature enhancement network, and detection head are redesigned: a lightweight backbone network substantially reduces the number of parameters, a high-resolution feature enhancement network retains more semantic and detail features, and the numerous, densely placed anchor boxes of a large-scale detection head improve localization accuracy. On the public VisDrone2019 dataset, compared with the baseline YOLOv5 (version 6.0) object detection algorithm, ACAM-YOLO improves mAP0.5 by 11.0% and mAP0.95 by 7.8% while reducing model parameters by 65.5%. The experiments show that ACAM-YOLO is well suited to detecting dense small targets in aerial imagery.

Key words: computer vision, small object detection, YOLOv5, attention mechanisms, drone
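
The abstract describes ACAM only at a high level: split the features along the channel dimension, derive spatial and channel attention separately, and fuse them with adaptive weights. The following NumPy sketch illustrates that data flow under stated assumptions. It is not the authors' implementation: the function name `acam_sketch`, the half-and-half channel split, the linear projection `w_channel`, the mean/max spatial statistics weighted by `w_spatial`, and the softmax fusion over `mix_logits` are all hypothetical choices made here for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def acam_sketch(x, w_channel, w_spatial, mix_logits):
    """Illustrative co-attention over a (C, H, W) feature map with even C.

    All parameters are hypothetical stand-ins for learned weights:
      w_channel  : (C, C//2) projection from pooled half-channels to C logits
      w_spatial  : (2,) weights for the per-pixel [mean, max] maps
      mix_logits : (2,) scalars turned into fusion weights by softmax
    """
    # Split the input along the channel dimension into two halves.
    x_ch, x_sp = np.split(x, 2, axis=0)          # each (C/2, H, W)

    # Channel branch: global average pooling, projection, sigmoid.
    pooled = x_ch.mean(axis=(1, 2))              # (C/2,)
    channel_att = sigmoid(w_channel @ pooled)    # (C,)

    # Spatial branch: per-pixel mean and max over channels, then sigmoid.
    stats = np.stack([x_sp.mean(axis=0), x_sp.max(axis=0)])        # (2, H, W)
    spatial_att = sigmoid(np.tensordot(w_spatial, stats, axes=1))  # (H, W)

    # Adaptive fusion: softmax turns the two scalars into weights summing to 1.
    mix = np.exp(mix_logits - mix_logits.max())
    mix = mix / mix.sum()
    att = mix[0] * channel_att[:, None, None] + mix[1] * spatial_att[None, :, :]

    # Re-weight the full input with the fused attention map.
    return x * att, att

# Usage with random stand-in weights (not trained values).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
out, att = acam_sketch(
    x,
    w_channel=rng.standard_normal((8, 4)),
    w_spatial=rng.standard_normal(2),
    mix_logits=np.zeros(2),
)
```

Because the fused map is a convex combination of two sigmoid outputs, every attention value lies between 0 and 1, so the module can only attenuate or preserve features, never amplify them. Whether the published ACAM shares this property is not stated in the abstract.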

CLC number: