Acta Aeronautica et Astronautica Sinica, 2025, Vol. 46, Issue (23): 631952. doi: 10.7527/S1000-6893.2025.31952
• special column •
Wei HUANG, Jiahao PAN, Chu HE
Received:2025-03-10
Revised:2025-03-28
Accepted:2025-04-25
Online:2025-05-23
Published:2025-05-08
Contact: Chu HE, E-mail: chuhe@whu.edu.cn
Wei HUANG, Jiahao PAN, Chu HE. Wavelet time-frequency localization-based model compression for UAV object detection[J]. Acta Aeronautica et Astronautica Sinica, 2025, 46(23): 631952.
Table 2
Ablation experiment of lightweight modules
| Model | Algorithm | Pre/% | Rec/% | mAP50/% | mAP50:95/% | Params/M | GFLOPs |
|---|---|---|---|---|---|---|---|
| A | YOLOv8s | 52.4 | 41.4 | 40.8 | 24.3 | 11.14 | 28.70 |
| B | YOLOv8s+MV2Block | 47.2 | 37.5 | 35.8 | 20.4 | 9.51 | 20.93 |
| C | YOLOv8s+SV2Block | 46.9 | 37.1 | 35.5 | 20.1 | 7.92 | 18.65 |
| D | YOLOv8s+GhostConv | 46.2 | 36.7 | 35.1 | 19.4 | 5.93 | 16.27 |
| E | D+W-DWConv | 48.4 | 38.8 | 37.6 | 22.0 | 5.93 | 16.30 |
| F | D+W-PWConv | 48.0 | 38.4 | 36.7 | 21.3 | 6.04 | 16.47 |
| G | D+W-DSConv | 50.8 | 38.8 | 38.4 | 22.4 | 6.04 | 16.49 |
Table 3
Ablation experiment of channel wavelet packet transform level N
| N | Pre/% (1024×1024) | Rec/% (1024×1024) | mAP50/% (1024×1024) | mAP50:95/% (1024×1024) | Pre/% (640×640) | Rec/% (640×640) | mAP50/% (640×640) | mAP50:95/% (640×640) | Params/M | GFLOPs |
|---|---|---|---|---|---|---|---|---|---|---|
| - | 46.2 | 36.7 | 35.1 | 19.4 | 40.6 | 29.9 | 28.5 | 15.7 | 5.93 | 16.27 |
| 1 | 48.3 | 37.9 | 37.1 | 21.7 | 41.9 | 32.1 | 29.8 | 17.0 | 5.93 | 16.28 |
| 2 | 48.0 | 38.7 | 37.4 | 21.9 | 42.3 | 31.9 | 30.1 | 17.2 | 5.93 | 16.29 |
| 3 | 48.4 | 38.8 | 37.6 | 22.0 | 42.9 | 32.5 | 30.4 | 17.4 | 5.93 | 16.30 |
| 4 | 48.2 | 38.4 | 37.3 | 21.8 | 42.4 | 32.2 | 30.2 | 17.2 | 5.93 | 16.31 |
| 5 | 47.9 | 37.7 | 36.9 | 21.6 | 42.1 | 32.0 | 29.9 | 17.1 | 5.93 | 16.32 |
| 6 | 47.6 | 37.4 | 36.6 | 21.3 | 41.8 | 31.8 | 29.7 | 16.9 | 5.93 | 16.34 |
Table 4
Ablation experiment of multi-scale wavelet transform level M
| M | Pre/% (1024×1024) | Rec/% (1024×1024) | mAP50/% (1024×1024) | mAP50:95/% (1024×1024) | Pre/% (640×640) | Rec/% (640×640) | mAP50/% (640×640) | mAP50:95/% (640×640) | Params/M | GFLOPs |
|---|---|---|---|---|---|---|---|---|---|---|
| - | 46.2 | 36.7 | 35.1 | 19.4 | 40.2 | 31.2 | 28.5 | 15.4 | 5.93 | 16.27 |
| 1 | 47.0 | 37.6 | 36.2 | 20.8 | 43.1 | 31.7 | 29.8 | 16.7 | 6.04 | 16.45 |
| 2 | 48.0 | 38.4 | 36.7 | 21.3 | 43.6 | 32.0 | 30.3 | 17.2 | 6.04 | 16.49 |
| 3 | 47.5 | 38.1 | 36.4 | 21.1 | 43.2 | 31.4 | 30.1 | 17.1 | 6.04 | 16.49 |
| 4 | 47.9 | 37.8 | 36.6 | 21.2 | 41.7 | 31.9 | 30.0 | 16.9 | 6.04 | 16.49 |
| 5 | 47.6 | 37.9 | 36.3 | 21.0 | 41.4 | 31.7 | 29.7 | 16.5 | 6.04 | 16.49 |
| 6 | 47.3 | 37.4 | 36.1 | 20.6 | 41.0 | 31.3 | 29.2 | 16.1 | 6.04 | 16.50 |
Table 5
Effect of different model scales during training
| Original model | mAP50/% | mAP50:95/% | Params/M | GFLOPs | Lightweight model | mAP50/% | mAP50:95/% | Params/M | GFLOPs |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv8n | 36.1 | 21.1 | 3.01 | 8.20 | WaveYOLOv8n | 32.4 | 18.8 | 1.72 | 5.23 |
| YOLOv8n-p2 | 38.1 | 22.4 | 2.93 | 12.38 | WaveYOLOv8n-p2 | 35.6 | 20.7 | 1.61 | 8.86 |
| YOLOv8n-p6 | 36.5 | 21.5 | 4.79 | 8.19 | WaveYOLOv8n-p6 | 33.4 | 19.3 | 2.70 | 5.24 |
| YOLOv8s | 40.8 | 24.3 | 11.14 | 28.70 | WaveYOLOv8s | 38.4 | 22.4 | 6.04 | 16.49 |
| YOLOv8s-p2 | 42.3 | 25.2 | 10.64 | 36.97 | WaveYOLOv8s-p2 | 39.8 | 23.3 | 5.43 | 22.53 |
| YOLOv8s-p6 | 41.0 | 24.4 | 17.88 | 28.58 | WaveYOLOv8s-p6 | 39.5 | 23.0 | 9.58 | 16.48 |
| YOLOv5s | 39.7 | 23.5 | 9.13 | 24.06 | WaveYOLOv5s | 38.6 | 22.4 | 5.98 | 16.58 |
| NanoDet | 28.5 | 15.6 | 0.94 | 3.61 | WaveNanoDet | 29.7 | 16.3 | 0.95 | 3.97 |
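
The trade-off in Table 5 is easier to read as relative changes. The following is a minimal sketch that recomputes, for a few rows copied from the table, the change in mAP50 and the percentage reduction in parameters and GFLOPs. The function names `relative_reduction` and `summarize` are illustrative and do not come from the paper.

```python
# Rows copied from Table 5:
# (original, mAP50, Params/M, GFLOPs, lightweight, mAP50, Params/M, GFLOPs)
ROWS = [
    ("YOLOv8n", 36.1, 3.01, 8.20, "WaveYOLOv8n", 32.4, 1.72, 5.23),
    ("YOLOv8s", 40.8, 11.14, 28.70, "WaveYOLOv8s", 38.4, 6.04, 16.49),
    ("YOLOv5s", 39.7, 9.13, 24.06, "WaveYOLOv5s", 38.6, 5.98, 16.58),
    ("NanoDet", 28.5, 0.94, 3.61, "WaveNanoDet", 29.7, 0.95, 3.97),
]


def relative_reduction(before, after):
    """Percentage reduction from `before` to `after` (negative means an increase)."""
    return 100.0 * (before - after) / before


def summarize(row):
    """Return the mAP50 change (points) and cost reductions (%) for one table row."""
    name, map_o, par_o, fl_o, light, map_l, par_l, fl_l = row
    return {
        "pair": f"{name} -> {light}",
        "map50_change": round(map_l - map_o, 1),
        "params_reduction_pct": round(relative_reduction(par_o, par_l), 1),
        "gflops_reduction_pct": round(relative_reduction(fl_o, fl_l), 1),
    }


if __name__ == "__main__":
    for row in ROWS:
        print(summarize(row))
```

For the YOLOv8s row this gives a 2.4-point drop in mAP50 for roughly 46% fewer parameters and 43% fewer GFLOPs, while the NanoDet row shows a small increase in both accuracy and cost.
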
Table 6
Experiment of wavelet frequency division quantization
| Model | Bit setting | mAP50/% | mAP50:95/% | Params/M | Size/MB | GFLOPs |
|---|---|---|---|---|---|---|
| YOLOv8s | FP32 | 40.8 | 24.3 | 11.14 | 44.56 | 28.70 |
| WaveYOLOv8s | FP32 | 38.4 | 22.4 | 6.04 | 24.16 | 16.49 |
| WaveYOLOv8s | W6A6 | 36.8 | 21.7 | 6.05 | 4.53 | 3.09 |
| WaveYOLOv8s | W4A4 | 35.9 | 20.6 | 6.05 | 3.02 | 2.06 |
| WaveYOLOv8s | W4A3 | 34.3 | 19.4 | 6.05 | 3.02 | 1.80 |
| YOLOv8s-p2 | FP32 | 42.3 | 25.2 | 10.64 | 42.56 | 36.97 |
| WaveYOLOv8s-p2 | FP32 | 39.8 | 23.3 | 5.43 | 21.72 | 22.53 |
| WaveYOLOv8s-p2 | W6A6 | 38.1 | 22.5 | 5.44 | 4.07 | 4.22 |
| WaveYOLOv8s-p2 | W4A4 | 37.7 | 21.8 | 5.44 | 2.72 | 2.82 |
| WaveYOLOv8s-p2 | W4A3 | 36.2 | 20.1 | 5.44 | 2.72 | 2.46 |
| YOLOv8s-p6 | FP32 | 41.0 | 24.4 | 17.88 | 71.52 | 28.58 |
| WaveYOLOv8s-p6 | FP32 | 39.5 | 23.0 | 9.58 | 38.32 | 16.48 |
| WaveYOLOv8s-p6 | W6A6 | 37.6 | 21.8 | 9.59 | 7.19 | 3.09 |
| WaveYOLOv8s-p6 | W4A4 | 36.4 | 21.0 | 9.59 | 4.79 | 2.06 |
| WaveYOLOv8s-p6 | W4A3 | 34.3 | 19.5 | 9.59 | 4.79 | 1.80 |
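
The Size/MB column in Table 6 appears consistent, within rounding, with a weights-only estimate: number of parameters times bits per weight, divided by 8. This relation is an inference from the reported numbers, not a formula stated on this page, and the helper names below are illustrative. A minimal sketch that checks a few rows:

```python
def weight_bits(setting):
    """Bits per weight: 'FP32' -> 32, 'W6A6' -> 6 (digits between 'W' and 'A')."""
    if setting == "FP32":
        return 32
    return int(setting[1:setting.index("A")])


def model_size_mb(params_millions, bits):
    """Weights-only storage in MB (10^6 bytes): parameters x bits / 8."""
    return params_millions * bits / 8.0


# (model, bit setting, Params/M, reported Size/MB) copied from Table 6
ROWS = [
    ("YOLOv8s", "FP32", 11.14, 44.56),
    ("WaveYOLOv8s", "FP32", 6.04, 24.16),
    ("WaveYOLOv8s", "W6A6", 6.05, 4.53),
    ("WaveYOLOv8s", "W4A4", 6.05, 3.02),
    ("WaveYOLOv8s-p2", "W6A6", 5.44, 4.07),
    ("WaveYOLOv8s-p6", "W6A6", 9.59, 7.19),
]

if __name__ == "__main__":
    for model, setting, params, reported in ROWS:
        estimate = model_size_mb(params, weight_bits(setting))
        print(f"{model} {setting}: estimated {estimate:.3f} MB, reported {reported} MB")
```

Under this reading, W4A4 and W4A3 share the same size because only the weight bit width affects storage; the activation bit width would affect compute rather than file size.
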
Address: No.238, Baiyan Building, Beisihuan Zhonglu Road, Haidian District, Beijing, China
Postal code : 100083
E-mail:hkxb@buaa.edu.cn
All copyright © editorial office of Chinese Journal of Aeronautics

