Real-time target detection algorithm for low altitude UAVs

Yonggang YANG; Wentao JIANG; Zhiyun GAO

doi:10.7527/S1000-6893.2025.31619

ACTA AERONAUTICA ET ASTRONAUTICA SINICA

2025, Vol. 46, Issue 16: 331619

Electronics and Electrical Engineering and Control

School of Transportation Science and Engineering, Civil Aviation University of China, Tianjin 300300, China

E-mail: zygao@cauc.edu.cn

Received date: 2024-12-06

Revised date: 2024-12-27

Accepted date: 2025-03-05

Online published: 2025-03-19

Supported by

National Natural Science Foundation of China (62403471); The Fundamental Research Funds for the Central Universities (3122023QD18); Key Laboratory of Technology and Equipment of Tianjin Urban Air Transportation System (TJKL-UAM-202402)

Abstract

To address the challenges of mutual occlusion, targets that occupy very few pixels, and complex backgrounds in low-altitude UAV-based object detection, this paper proposes HPRS-YOLO, a small-target detection algorithm optimized for UAV platforms. The backbone network incorporates a novel Spatial Pyramid Multi-scale Common Convolution (SPMCC) module, which replaces max-pooling-based downsampling with dilated convolution to dynamically adjust the receptive field, thereby enhancing contextual feature extraction. The improved C3K2 module integrates two MetaFormer architectures to reinforce the structural and textural features of small targets while reducing parameters and maintaining low computational overhead. Additionally, a dynamic upsampling operator, DySample, is introduced to suppress offset overlaps and boundary pixel value confusion, thereby improving target-background contrast. The neck network is redesigned with a Shallow Detail Focus Module (SDFM) to achieve cross-scale feature calibration between terminal layers, emphasizing low-level feature maps to compensate for missing small-target characteristics and preserve the spatial integrity of occluded objects. Ablation and comparison experiments are conducted on the VisDrone2019 dataset. The results show that mAP_0.5 and mAP_0.5:0.95 improve by 5% and 3%, respectively, compared with the baseline method. Generalization experiments on the public DOTA dataset show that mAP_0.5 improves by 2.0%, demonstrating good robustness. Finally, deploying the model on an embedded NVIDIA Jetson AGX Orin device achieves 60 FPS, demonstrating that HPRS-YOLO delivers real-time detection through its optimized algorithm design while maintaining high accuracy.

Key words: low-altitude UAV; small target detection; multi-scale; cross-scale feature calibration; YOLOv11n; Jetson AGX Orin
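
The abstract states that SPMCC replaces max-pooling with dilated convolution to adjust the receptive field. The SPMCC design itself is not given here, so the following is only a minimal one-dimensional sketch of the general idea, not the authors' module. The function names (`effective_kernel`, `dilated_conv1d`), the toy signal, and the all-ones weights are illustrative assumptions. It shows how a fixed 3-tap kernel covers a wider span as the dilation rate grows, without adding weights.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, weights, dilation=1):
    """Stride-1, unpadded dilated 1-D convolution (cross-correlation form)."""
    span = effective_kernel(len(weights), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Taps are spaced `dilation` samples apart, so the same number of
        # weights sees a wider context window.
        outputs.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(weights))
        )
    return outputs


if __name__ == "__main__":
    signal = [0, 1, 0, 0, 5, 0, 0, 1, 0]  # hypothetical toy input
    weights = [1, 1, 1]                   # hypothetical 3-tap kernel
    for rate in (1, 2, 3):
        span = effective_kernel(len(weights), rate)
        print(f"dilation={rate} span={span} out={dilated_conv1d(signal, weights, rate)}")
```

With three weights, dilation rates 1, 2, and 3 cover spans of 3, 5, and 7 samples. Running several rates in parallel is one common way to gather multi-scale context; whether SPMCC combines its branches this way is not stated in the abstract.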

Cite this article

Yonggang YANG, Wentao JIANG, Zhiyun GAO. Real-time target detection algorithm for low altitude UAVs[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2025, 46(16): 331619-331619. DOI: 10.7527/S1000-6893.2025.31619
