Aerial-photography dense small target detection algorithm based on adaptive cooperative attention mechanism
Received date: 2022-08-24
Revised date: 2022-09-05
Accepted date: 2022-10-20
Online published: 2022-10-26
Supported by
Aeronautical Science Foundation of China (2020Z005072001)
LI Zihao, WANG Zhengping, HE Yuntao. Aerial-photography dense small target detection algorithm based on adaptive cooperative attention mechanism[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(13): 327944-327944. DOI: 10.7527/S1000-6893.2022.27944
To address the large number of targets and the high proportion of small targets in wide-field-of-view UAV aerial imagery, this paper proposes ACAM-YOLO, a UAV aerial object detection algorithm based on an adaptive cooperative attention mechanism. An Adaptive Cooperative Attention Module (ACAM) is embedded in the backbone network and the feature enhancement network. ACAM splits the input features along the channel dimension, extracts spatial attention features and channel attention features separately, and adaptively weights them into cooperative attention weights, which increases the utilization of useful spatial and channel information in the input features. To improve detection accuracy while keeping the network lightweight, the backbone network, feature enhancement network, and detection head are redesigned: a lightweight backbone greatly reduces the parameter count, a high-resolution feature enhancement network retains more semantic and detail features, and the numerous, densely placed anchor boxes of a large-scale detection head improve localization accuracy. On the public VisDrone2019 dataset, compared with the baseline (version 6.0 of the YOLOv5 object detection algorithm), ACAM-YOLO improves mAP0.5 by 11.0% and mAP0.95 by 7.8% while reducing model parameters by 65.5%. The experiments show that ACAM-YOLO is highly practical for detecting dense small targets in aerial imagery.
Key words: computer vision; small object detection; YOLOv5; attention mechanisms; drone
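The abstract describes ACAM only at a high level: split the features along the channel dimension, derive spatial and channel attention separately, then fuse them adaptively. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The function name `cooperative_attention`, the half-and-half channel split, the mean-pooling plus sigmoid gates, the softmax fusion over two logits, and the reuse of the fused weight on both halves are all assumptions made for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax2(a, b):
    """Softmax over two logits, returned as a pair that sums to 1."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def cooperative_attention(x, logit_spatial=0.0, logit_channel=0.0):
    """Hypothetical ACAM-style gate; x is a nested list shaped [C][H][W], C even.

    The two logits stand in for parameters that would be learned in training.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    if c % 2:
        raise ValueError("channel count must be even")
    half = c // 2
    x_spatial, x_channel = x[:half], x[half:]  # split along the channel axis

    # Spatial branch (assumed): mean over channels, one gate per location.
    spatial = [[sigmoid(sum(x_spatial[k][i][j] for k in range(half)) / half)
                for j in range(w)] for i in range(h)]

    # Channel branch (assumed): global average pooling, one gate per channel.
    channel = [sigmoid(sum(map(sum, x_channel[k])) / (h * w))
               for k in range(half)]

    # Adaptive fusion (assumed): convex combination of the two gates.
    a_s, a_c = softmax2(logit_spatial, logit_channel)
    weight = [[[a_s * spatial[i][j] + a_c * channel[k] for j in range(w)]
               for i in range(h)] for k in range(half)]

    # Reuse the fused weight for both halves so the output keeps shape [C][H][W].
    return [[[x[k][i][j] * weight[k % half][i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


x = [[[1.0, -1.0], [0.5, 0.0]] for _ in range(4)]
y = cooperative_attention(x)
print(len(y), len(y[0]), len(y[0][0]))  # 4 2 2
```

Because both gates lie in (0, 1) and the fusion coefficients sum to 1, the fused weight also lies in (0, 1), so the module rescales features without changing their sign or shape. In a real network the two logits would be trainable and the pooling would typically be followed by small convolutional or fully connected layers.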