Algorithm for Real-Time Target Detection in Low-Altitude UAVs

  • YANG Yong-Gang,
  • JIANG Wen-Tao,
  • GAO Zhi-Yun
  • Civil Aviation University of China

Received date: 2024-12-06

  Revised date: 2025-03-10

  Online published: 2025-03-19

Supported by

National Natural Science Foundation of China; the Fundamental Research Funds for the Central Universities; Key Laboratory of Technology and Equipment of Tianjin Urban Air Transportation System

Cite this article

YANG Yong-Gang, JIANG Wen-Tao, GAO Zhi-Yun. Algorithm for Real-Time Target Detection in Low-Altitude UAVs[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2025.31619

Abstract

To address the mutual occlusion, small pixel footprints, and complex backgrounds of targets seen from low-altitude UAVs, this paper proposes HPRS-YOLO, a small-target detection algorithm for low-altitude UAV platforms. In the backbone, a new multi-scale spatial pyramid (SPMCC) abandons max-pooling-based downsampling and uses dilated convolution to dynamically adjust the network's receptive field, capturing the contextual information of detected objects more effectively. The C3K2 module is improved by fusing two Metaformer models, which strengthens the structural and textural features of small targets while reducing the parameter count and keeping computational overhead low. The upsampling operator is optimized with Dysample, which suppresses offset overlap and boundary-point value confusion and improves target-background contrast. A Shallow Detail Focus Module (SDFM) is introduced to redesign the tail end of the neck network; it performs head-to-tail cross-scale feature calibration, emphasizes low-level feature maps, compensates for missing small-target features, and preserves the remaining spatial information of occluded targets. In ablation and comparison experiments on VisDrone2019, mAP0.5 and mAP0.5:0.95 improve by 5 and 3 percentage points, respectively, over the baseline algorithm. In a generalization experiment on the public DOTA dataset, mAP0.5 improves by 2.0%, indicating good robustness. Finally, the model is deployed on an NVIDIA Jetson AGX Orin embedded device, where it reaches 60 FPS, showing that HPRS-YOLO maintains high accuracy while retaining real-time detection capability.
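The abstract's point that dilated convolution adjusts the receptive field without pooling can be illustrated with a small calculation. The sketch below is not the paper's SPMCC module; the helper names and the dilation rates (1, 2, 3) are hypothetical choices for illustration only. It relies on the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions.

```python
from typing import Iterable, Tuple


def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span of a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers: Iterable[Tuple[int, int, int]]) -> int:
    """Receptive field of a stack of conv layers.

    Each layer is (kernel_size, dilation, stride). `jump` tracks how many
    input pixels separate two neighbouring positions of the current map.
    """
    rf, jump = 1, 1
    for kernel_size, dilation, stride in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


if __name__ == "__main__":
    # Hypothetical dilation rates, not taken from the paper.
    for d in (1, 2, 3):
        print(f"3x3 kernel, dilation {d}: spans {effective_kernel(3, d)} pixels")
    stack = [(3, 1, 1), (3, 2, 1), (3, 3, 1)]
    print("receptive field of the stack:", receptive_field(stack))
```

With stride 1 throughout, the feature-map resolution is unchanged while the receptive field grows with the dilation rate, which is the property the abstract attributes to replacing max-pooling-based downsampling.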
