Acta Aeronautica et Astronautica Sinica ›› 2025, Vol. 46 ›› Issue (22): 331987. doi: 10.7527/S1000-6893.2025.31987
• Electronics and Electrical Engineering and Control •
Received: 2025-03-18
Revised: 2025-03-26
Accepted: 2025-04-21
Online: 2025-04-29
Published: 2025-04-25
Contact: Liping WANG
E-mail: sduwlp@163.com
Shuai ZHONG, Liping WANG. MCS-DETR: Improved RT-DETR object detection method for UAV aerial images[J]. Acta Aeronautica et Astronautica Sinica, 2025, 46(22): 331987.
Table 2
Comparison experiments on MCS-DETR network structure
| Model | P/% | R/% | mAP0.5/% | mAP0.5:0.95/% | Params/M | FLOPs/G |
|---|---|---|---|---|---|---|
| YOLOv5m | 46.8 | 35.2 | 33.6 | 19.5 | 25.0 | 64.0 |
| YOLOv8s | 44.8 | 33.6 | 31.6 | 18.0 | 11.1 | 28.5 |
| YOLOv8m | 48.0 | 35.7 | 34.4 | 19.9 | 25.8 | 78.7 |
| YOLOv10n | 39.9 | 29.8 | 27.2 | 14.9 | 2.2 | 6.5 |
| YOLOv10s | 44.7 | 34.4 | 32.4 | 18.2 | 7.2 | 21.4 |
| YOLOv11n | 38.8 | 30.2 | 27.2 | 15.3 | 2.5 | 6.3 |
| YOLOv11s | 44.7 | 34.5 | 32.3 | 18.7 | 9.4 | 21.3 |
| YOLOv11m | 48.1 | 37.8 | 36.0 | 21.4 | 20.0 | 67.7 |
| RTMDet | 48.1 | 34.7 | 35.3 | 21.1 | 52.3 | 80.0 |
| Efficient DETR | 49.5 | 36.1 | 36.7 | 22.0 | 32.0 | 159 |
| RT-DETR-R18 | 54.6 | 38.2 | 37.1 | 21.2 | 19.8 | 57.0 |
| RT-DETR-R34 | 57.2 | 39.6 | 38.6 | 22.5 | 31.1 | 88.8 |
| MCS-DETR | 57.0 | 41.3 | 39.9 | 23.2 | 15.7 | 64.3 |
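
One way to read Table 2 is as a trade-off between model size and accuracy. The sketch below is an illustrative check, not part of the paper's method: it copies the Params/M and mAP0.5/% columns and lists the models that no other model beats on both axes at once. The helper name `pareto_front` is an assumption introduced here.

```python
# (Params/M, mAP0.5/%) per model, copied from Table 2.
MODELS = {
    "YOLOv5m": (25.0, 33.6), "YOLOv8s": (11.1, 31.6), "YOLOv8m": (25.8, 34.4),
    "YOLOv10n": (2.2, 27.2), "YOLOv10s": (7.2, 32.4), "YOLOv11n": (2.5, 27.2),
    "YOLOv11s": (9.4, 32.3), "YOLOv11m": (20.0, 36.0), "RTMDet": (52.3, 35.3),
    "Efficient DETR": (32.0, 36.7), "RT-DETR-R18": (19.8, 37.1),
    "RT-DETR-R34": (31.1, 38.6), "MCS-DETR": (15.7, 39.9),
}


def pareto_front(models):
    """Return models not dominated on (fewer params, higher mAP0.5)."""
    front = []
    for name, (params, score) in models.items():
        dominated = any(
            other != name
            and o_params <= params
            and o_score >= score
            and (o_params < params or o_score > score)
            for other, (o_params, o_score) in models.items()
        )
        if not dominated:
            front.append(name)
    # Order the surviving models from smallest to largest.
    return sorted(front, key=lambda n: models[n][0])


front = pareto_front(MODELS)
print(front)
```

On these two columns alone, every model with more parameters than MCS-DETR also reports a lower mAP0.5; the comparison ignores FLOPs and speed, which Table 2 and Table 5 report separately.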
Table 3
RT-DETR model’s accuracy values for various categories on VisDrone2019-DET-Test dataset
| Category | P/% | R/% | mAP50/% | mAP50:95/% |
|---|---|---|---|---|
| pedestrian | 56.7 | 36.0 | 37.6 | 15.20 |
| people | 57.0 | 24.9 | 27.3 | 9.91 |
| bicycle | 38.7 | 13.6 | 11.7 | 5.18 |
| car | 76.8 | 75.3 | 76.8 | 49.00 |
| van | 53.1 | 41.5 | 37.5 | 26.30 |
| truck | 53.3 | 44.6 | 42.7 | 26.60 |
| tricycle | 33.3 | 28.5 | 21.5 | 12.10 |
| awning-tricycle | 46.3 | 21.3 | 20.2 | 12.40 |
| bus | 77.9 | 52.2 | 56.7 | 39.50 |
| motor | 53.0 | 43.7 | 39.2 | 16.00 |
| all class | 54.6 | 38.2 | 37.1 | 21.20 |
Table 4
MCS-DETR model’s accuracy values for various categories on VisDrone2019-DET-Test dataset
| Category | P/% | R/% | mAP50/% | mAP50:95/% |
|---|---|---|---|---|
| pedestrian | 58.8 | 39.8 | 41.4 | 17.2 |
| people | 56.7 | 27.5 | 29.2 | 10.8 |
| bicycle | 42.8 | 16.7 | 14.2 | 5.94 |
| car | 78.3 | 77.8 | 78.9 | 51.3 |
| van | 55.2 | 42.9 | 38.4 | 27.4 |
| truck | 58.6 | 46.9 | 45.5 | 29.3 |
| tricycle | 36.5 | 36.0 | 27.3 | 15.4 |
| awning-tricycle | 49.5 | 23.9 | 22.1 | 13.8 |
| bus | 77.7 | 54.2 | 58.2 | 41.9 |
| motor | 56.2 | 47.2 | 43.9 | 18.9 |
| all class | 57.0 | 41.3 | 39.9 | 23.2 |
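
The per-category differences between Table 3 and Table 4 can be tabulated directly. The following is a minimal sketch that copies the mAP50 columns of both tables and computes the gain in percentage points per category; the function name `map50_gains` is hypothetical and used only for illustration.

```python
# mAP50 (%) per category, copied from Table 3 (RT-DETR) and Table 4 (MCS-DETR).
RT_DETR_MAP50 = {
    "pedestrian": 37.6, "people": 27.3, "bicycle": 11.7, "car": 76.8,
    "van": 37.5, "truck": 42.7, "tricycle": 21.5, "awning-tricycle": 20.2,
    "bus": 56.7, "motor": 39.2,
}
MCS_DETR_MAP50 = {
    "pedestrian": 41.4, "people": 29.2, "bicycle": 14.2, "car": 78.9,
    "van": 38.4, "truck": 45.5, "tricycle": 27.3, "awning-tricycle": 22.1,
    "bus": 58.2, "motor": 43.9,
}


def map50_gains(baseline, improved):
    """Return {category: gain in percentage points}, rounded to one decimal."""
    return {name: round(improved[name] - baseline[name], 1) for name in baseline}


gains = map50_gains(RT_DETR_MAP50, MCS_DETR_MAP50)
best_category = max(gains, key=gains.get)
mean_gain = round(sum(gains.values()) / len(gains), 1)

# Print categories from largest to smallest gain.
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name:16s} +{gain:.1f}")
```

The mean of the ten per-category gains agrees with the difference between the two "all class" rows (39.9 against 37.1), and the largest single gain falls on the tricycle category.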
Table 5
Ablation experiments on RT-DETR network structure
| RT-DETR | MSEIE | CATM-AIFI | SOEP | Params | FLOPs/G | P/% | R/% | mAP0.5/% | mAP0.5:0.95/% | FPS/(frames·s⁻¹) |
|---|---|---|---|---|---|---|---|---|---|---|
| √ | | | | 19 884 600 | 57.0 | 54.6 | 38.2 | 37.1 | 21.2 | 96.5 |
| √ | √ | | | 14 468 248 | 48.4 | 55.8 | 39.6 | 38.5 | 22.5 | 64.3 |
| √ | | √ | | 19 960 888 | 57.1 | 55.1 | 39.1 | 37.9 | 22.1 | 96.9 |
| √ | | | √ | 20 500 536 | 65.2 | 55.3 | 39.4 | 38.1 | 22.0 | 99.2 |
| √ | √ | | √ | 15 685 528 | 64.2 | 56.2 | 40.9 | 39.2 | 22.6 | 61.3 |
| √ | √ | √ | √ | 15 761 816 | 64.3 | 57.0 | 41.3 | 39.9 | 23.2 | 61.7 |
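
The cost side of the ablation in Table 5 can be summarized with a few lines of arithmetic. This sketch copies the parameter count, FLOPs, mAP0.5 and FPS of each row in the order they appear and compares the first row (RT-DETR baseline) with the last row (all modules enabled); the tuple layout and the helper `relative_change` are assumptions made for this example.

```python
# Rows of Table 5 in order: (parameter count, FLOPs/G, mAP0.5/%, FPS).
ABLATION_ROWS = [
    (19_884_600, 57.0, 37.1, 96.5),  # row 1: RT-DETR baseline
    (14_468_248, 48.4, 38.5, 64.3),
    (19_960_888, 57.1, 37.9, 96.9),
    (20_500_536, 65.2, 38.1, 99.2),
    (15_685_528, 64.2, 39.2, 61.3),
    (15_761_816, 64.3, 39.9, 61.7),  # row 6: all three modules enabled
]


def relative_change(before, after):
    """Percentage change from `before` to `after`, rounded to one decimal."""
    return round(100.0 * (after - before) / before, 1)


baseline, full = ABLATION_ROWS[0], ABLATION_ROWS[-1]
param_change = relative_change(baseline[0], full[0])
fps_change = relative_change(baseline[3], full[3])
map_gain = round(full[2] - baseline[2], 1)

# Parameter difference between rows 1 and 3, and between rows 5 and 6.
delta_single = ABLATION_ROWS[2][0] - ABLATION_ROWS[0][0]
delta_final = ABLATION_ROWS[5][0] - ABLATION_ROWS[4][0]

print(f"params {param_change}%  FPS {fps_change}%  mAP0.5 +{map_gain}")
print(delta_single, delta_final)
```

The two parameter differences are identical, which would be consistent with the same module being added in both steps; that reading is an inference from the numbers rather than something the table states. Overall, the full model trades roughly a fifth fewer parameters and a higher mAP0.5 for a lower frame rate.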


