Real-time dense small object detection algorithm for UAV based on improved YOLOv5

doi:10.7527/S1000-6893.2022.27106

Electronics and Electrical Engineering and Control

Zhiqiang FENG¹, Zhijun XIE¹, Zhengwei BAO², Kewei CHEN³

1. School of Information Science and Engineering, Ningbo University, Ningbo 315211, China
2. Ningbo JIWANG Information Technology Ltd, Ningbo 315000, China
3. School of Mechanical Engineering and Mechanics, Ningbo University, Ningbo 315211, China

Received: 2022-03-04 Revised: 2022-03-22 Accepted: 2022-04-28 Online: 2022-05-11 Published: 2022-05-11
Contact: Zhijun XIE E-mail: xiezhijun@nbu.edu.cn
Supported by:
National Natural Science Foundation of China (U20A20121); Zhejiang Provincial Natural Science Foundation (LY21F020006); Ningbo Natural Science Foundation (2019A610088); Ningbo "Science and Technology Innovation 2025" Major Project (2019B10125)

Abstract:

Compared with natural scene images, UAV aerial images have more complex backgrounds and contain many densely packed small targets, which places higher demands on the detection network. To address the low accuracy of dense small object detection from the UAV viewpoint while preserving real-time performance, a YOLOv5-based real-time dense small object detection algorithm for UAVs is proposed. First, the Spatial Attention Module (SAM) is combined with the Channel Attention Module (CAM): the fully connected layer that follows feature compression in CAM is improved to reduce computation, and the connection structure between CAM and SAM is changed to improve the capture of spatial features. The resulting Spatial-Channel Attention Module (SCAM) increases the model's attention to regions of the feature map where small targets cluster. Second, an SCAM-based Attentional Feature Fusion module (SC-AFF) is proposed, which adaptively assigns attention weights according to feature maps of different scales and so improves the feature fusion efficiency for small targets. Finally, a Transformer module is introduced into the backbone network, and SC-AFF is used to improve the feature fusion at its original residual connections, so that global information and rich contextual information are captured better and feature extraction for dense small targets in complex backgrounds improves. Experiments are conducted on the VisDrone2021 dataset. The effects of different network scales and input resolutions on the detection accuracy and speed of YOLOv5 are investigated first, and the analysis concludes that YOLOv5s is the more suitable baseline for real-time UAV object detection. With YOLOv5s as the baseline, the improved model raises mAP₅₀ by 6.4 percentage points and mAP₇₅ by 5.8 percentage points, and reaches 46 FPS on high-resolution images. The model trained at an input resolution of 1504×1504 reaches an mAP₅₀ of 54.5%, which is 11.5 percentage points higher than YOLOv4, while the detection speed remains at 46 FPS, making it better suited to real-time UAV object detection in dense small-target scenes.

Key words: UAV, small object detection, attention mechanism, self-attention mechanism, feature fusion
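
The attention-weighted fusion idea behind SC-AFF can be illustrated with a minimal sketch. It uses the general attentional feature fusion form, fused = w·x + (1 − w)·y, where the weight w is computed from x + y. The attention function here is a deliberately simple stand-in (a sigmoid of each channel's global average), not the SCAM module of the paper, and every function name is hypothetical.

```python
import math


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def channel_weights(feature):
    """Stand-in attention (assumption): one weight per channel,
    computed from that channel's global average."""
    return [sigmoid(sum(ch) / len(ch)) for ch in feature]


def attentional_fusion(x, y):
    """Fuse two feature maps of identical shape (channels x positions).

    Each output value is a convex combination w*x + (1-w)*y, with the
    weight w derived from the element-wise sum of both inputs.
    """
    summed = [[a + b for a, b in zip(cx, cy)] for cx, cy in zip(x, y)]
    weights = channel_weights(summed)
    return [
        [w * a + (1.0 - w) * b for a, b in zip(cx, cy)]
        for w, cx, cy in zip(weights, x, y)
    ]


# Two toy feature maps with 2 channels and 2 spatial positions each.
x = [[1.0, 3.0], [0.0, 0.0]]
y = [[-1.0, -3.0], [2.0, 4.0]]
fused = attentional_fusion(x, y)
```

Because each weight lies strictly between 0 and 1, every fused value stays between the two inputs it combines; a plain residual addition, by contrast, weights both inputs equally regardless of content.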

CLC number: V279

Zhiqiang FENG, Zhijun XIE, Zhengwei BAO, Kewei CHEN. Real-time dense small object detection algorithm for UAV based on improved YOLOv5[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(7): 327106.


References

[1] JIANG B, QU R K, LI Y D, et al. Object detection in UAV imagery based on deep learning: Review[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(4): 524519 (in Chinese).
[2] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[3] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 779-788.
[4] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]//European Conference on Computer Vision (ECCV). Amsterdam: Springer, 2016: 21-37.
[5] REDMON J, FARHADI A. YOLO9000: Better, faster, stronger[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6517-6525.
[6] REDMON J, FARHADI A. YOLOv3: An incremental improvement[DB/OL]. arXiv preprint: 1804.02767, 2018.
[7] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: Optimal speed and accuracy of object detection[DB/OL]. arXiv preprint: 2004.10934, 2020.
[8] LI K C, WANG X Q, LIN H, et al. Survey of one-stage small object detection methods in deep learning[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(1): 41-58 (in Chinese).
[9] WANG Q C, ZHANG H, HONG X G, et al. Small object detection based on modified FSSD and model compression[C]//2021 IEEE 6th International Conference on Signal and Image Processing (ICSIP), 2021: 88-92.
[10] GONG Y Q, YU X H, DING Y, et al. Effective fusion factor in FPN for tiny object detection[C]//2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2021: 1159-1167.
[11] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 936-944.
[12] LIU F, HAN X. Adaptive aerial object detection based on multi-scale deep learning[J]. Acta Aeronautica et Astronautica Sinica, 2022, 43(5): 325270 (in Chinese).
[13] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//Computer Vision – ECCV 2018, 2018.
[14] WANG Q L, WU B G, ZHU P F, et al. ECA-Net: Efficient channel attention for deep convolutional neural networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2020: 11531-11539.
[15] LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8759-8768.
[16] DAI Y M, GIESEKE F, OEHMCKE S, et al. Attentional feature fusion[C]//2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2021: 3559-3568.
[17] ZHU L L, GENG X, LI Z, et al. Improving YOLOv5 with attention mechanism for detecting boulders from planetary images[J]. Remote Sensing, 2021, 13(18): 3776.
[18] ZHU X K, LYU S C, WANG X, et al. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios[C]//2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Piscataway: IEEE Press, 2021: 2778-2788.
[19] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale[C]//International Conference on Learning Representations (ICLR), 2021.
[20] PAN X R, GE C J, LU R, et al. On the integration of self-attention and convolution[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2022: 805-815.
[21] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[DB/OL]. arXiv preprint: 1706.03762, 2017.
[22] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//2017 IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2999-3007.
[23] ZHANG S F, WEN L Y, BIAN X, et al. Single-shot refinement neural network for object detection[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4203-4212.
[24] CAI Z W, VASCONCELOS N. Cascade R-CNN: Delving into high quality object detection[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 6154-6162.
[25] LI Z M, PENG C, YU G, et al. Light-head R-CNN: In defense of two-stage object detector[DB/OL]. arXiv preprint: 1711.07264, 2017.
[26] LAW H, DENG J. CornerNet: Detecting objects as paired keypoints[J]. International Journal of Computer Vision, 2020, 128(3): 642-656.
[27] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916.


Model	Parameters	GFLOPs	mAP₅₀/%
YOLOv5s	7,037,095	15.9268	33.04
YOLOv5s+CAM	7,099,079	16.0671	33.21
YOLOv5s+CAM-	7,037,310	15.9273	33.30
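
The cost difference between the two CAM variants is easier to see as an overhead relative to the YOLOv5s baseline. The short calculation below uses only the parameter counts in the table above; reading the "CAM-" row as the CAM variant with the modified fully connected layer is an assumption, since the table itself does not label it.

```python
# Parameter counts copied from the table above.
baseline_params = 7_037_095    # YOLOv5s
cam_params = 7_099_079         # YOLOv5s+CAM
cam_light_params = 7_037_310   # YOLOv5s+CAM- (assumed: the modified CAM)


def overhead(params: int, base: int = baseline_params):
    """Return (extra parameters, extra parameters as a percentage of base)."""
    extra = params - base
    return extra, 100.0 * extra / base


cam_extra, cam_pct = overhead(cam_params)
light_extra, light_pct = overhead(cam_light_params)

print(f"CAM : +{cam_extra} parameters ({cam_pct:.2f}% of baseline)")
print(f"CAM-: +{light_extra} parameters ({light_pct:.4f}% of baseline)")
```

On these numbers the CAM row adds 61,984 parameters and the CAM- row adds 215, while the reported mAP₅₀ of the CAM- row is slightly higher (33.30 versus 33.21).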

Model	Method	mAP₅₀/%	mAP₇₅/%	mAP₅₀:₉₅/%	Precision/%	Params/M	GFLOPs	FPS_{1504}
A	YOLOv5s	33.0	14.8	16.5	45.1	7.0371	15.9	122
B	YOLOv5s+CBAM (neck)	33.6	15.2	16.8	47.3	7.3470	16.7	82
C	YOLOv5s+SCAM (neck)	34.3	16.9	17.7	48.0	7.0479	15.9	122
D	C+SC-AFF	37.0	18.7	19.5	49.1	7.0544	16.0	118
E	YOLOv5s+Transformer	36.0	16.3	18.1	48.7	8.4294	19.1	46
F	YOLOv5s+SC-Transformer	37.2	19.5	20.2	49.3	8.4316	19.1	46
G	F+SCAM (backbone & neck)+SC-AFF	39.4	20.6	21.4	50.9	8.4575	19.2	46
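
The gains quoted in the abstract are absolute differences between rows of this table. The sketch below recomputes them from the mAP₅₀ and mAP₇₅ columns; the dictionary layout and the helper name `gain` are illustrative choices, not part of the paper.

```python
# (mAP50, mAP75) in percent for each ablation row of the table above.
ablation = {
    "A": (33.0, 14.8),  # YOLOv5s baseline
    "B": (33.6, 15.2),  # + CBAM in the neck
    "C": (34.3, 16.9),  # + SCAM in the neck
    "D": (37.0, 18.7),  # C + SC-AFF
    "E": (36.0, 16.3),  # + Transformer
    "F": (37.2, 19.5),  # + SC-Transformer
    "G": (39.4, 20.6),  # full model
}


def gain(model: str, ref: str = "A"):
    """Absolute gain in percentage points of `model` over `ref`."""
    return tuple(round(m - r, 1) for m, r in zip(ablation[model], ablation[ref]))


full_gain = gain("G")            # full model vs. baseline
scam_vs_cbam = gain("C", "B")    # SCAM vs. CBAM, both in the neck
relative_map50 = 100.0 * full_gain[0] / ablation["A"][0]

print(full_gain, scam_vs_cbam, round(relative_map50, 1))
```

The full model gains 6.4 points of mAP₅₀ and 5.8 points of mAP₇₅ over the baseline, which corresponds to a relative mAP₅₀ improvement of roughly 19%.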

Method	Input size	mAP₅₀/%	Params/M	GFLOPs	Size/MB	FPS_{1504}
YOLOv5n	640	27.9	1.777	4.2	14.8	384
YOLOv5n	1024	39.3	1.777	10.8	15.0	384
YOLOv5n	1504	46.7	1.777	23.3	15.3	384
YOLOv5s	640	33.0	7.037	15.9	57.0	122
YOLOv5s	1024	47.2	7.037	40.8	57.1	122
YOLOv5s	1504	51.9	7.037	88.0	57.5	122
Proposed-s	640	39.4	8.438	19.2	68.2	46
Proposed-s	1024	48.6	8.438	51.2	69.2	46
Proposed-s	1504	54.5	8.438	109.7	71.0	46
Proposed-m	640	40.3	25.480	58.3	201.9	31
Proposed-m	1024	50.5	25.480	149.2	202.9	31
Proposed-m	1504	55.6	25.480	321.8	204.7	31
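
As a rough sanity check on the GFLOPs column, the cost at a larger input size can be estimated from the 640 figure by scaling with the number of input pixels. That proportionality is an assumption of this sketch (it is typical for convolutional layers), not a statement from the paper, and the helper name is hypothetical.

```python
def scaled_gflops(gflops_at_640: float, size: int, base_size: int = 640) -> float:
    """Estimate GFLOPs at `size`, assuming cost grows with pixel count."""
    return gflops_at_640 * (size / base_size) ** 2


# method: (GFLOPs at 640, {input size: reported GFLOPs}) from the table above.
reported = {
    "YOLOv5n": (4.2, {1024: 10.8, 1504: 23.3}),
    "YOLOv5s": (15.9, {1024: 40.8, 1504: 88.0}),
    "Proposed-s": (19.2, {1024: 51.2, 1504: 109.7}),
    "Proposed-m": (58.3, {1024: 149.2, 1504: 321.8}),
}

# Relative error of the pixel-count estimate against each reported value.
errors = {}
for name, (base, rows) in reported.items():
    for size, value in rows.items():
        estimate = scaled_gflops(base, size)
        errors[(name, size)] = abs(estimate - value) / value
        print(f"{name:11s} {size}: estimated {estimate:6.1f}, reported {value}")
```

Under this assumption every estimate lands within 5% of the reported value, so the compute cost of all four models grows roughly 5.5-fold between 640 and 1504 input.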

Method	mAP₅₀/%	mAP₇₅/%	mAP₅₀:₉₅/%	FPS_{1504}
RetinaNet [22]	28.7	11.6	11.8	-
RefineDet [23]	28.8	14.1	14.9	-
Cascade-RCNN [24]	31.9	15.6	16.1	-
FPN	32.2	14.9	16.5	-
Light-RCNN [25]	32.8	15.1	16.5	-
Faster-RCNN	33.2	15.2	17.0	15
CornerNet [26]	34.1	15.9	17.4	33
YOLOv3	41.7	22.9	24.5	31
YOLOv3-SPP	41.9	23.1	25.4	32
YOLOv4	43.0	25.2	24.9	35
YOLOv5-v6.0	44.7	26.8	26.4	35
Proposed	54.5	33.1	32.0	46
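
The three accuracy columns differ only in how strictly a predicted box must overlap the ground truth: mAP₅₀ accepts a match at an intersection over union (IoU) of at least 0.5, mAP₇₅ requires 0.75, and mAP₅₀:₉₅ averages over thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below, with made-up box coordinates, shows why small targets are sensitive to the stricter thresholds.

```python
def iou(a, b) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical example: a 10x10 px target and a prediction shifted 2 px right.
truth = (10, 10, 20, 20)
pred = (12, 10, 22, 20)
overlap = iou(truth, pred)

# Thresholds 0.50, 0.55, ..., 0.95 as used for the averaged metric.
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
passed = [t for t in thresholds if overlap >= t]

print(f"IoU = {overlap:.3f}; matched at thresholds {passed}")
```

A two-pixel localization error on a ten-pixel box already drops the IoU to about 0.667, so the detection counts toward mAP₅₀ but not toward mAP₇₅.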
