Current deep learning-based object detection models for remote sensing rely on the powerful computing and storage resources of ground servers to process massive volumes of remote sensing data efficiently. On mobile edge devices such as Unmanned Aerial Vehicles (UAVs), however, limited computational resources, storage capacity, and energy budgets make large-scale models difficult to deploy. Model compression techniques such as lightweight design and model quantization have therefore become critical to the practical application of remote sensing object detection algorithms. This paper proposes a model compression framework based on wavelet time-frequency localization, using a single-stage object detection network as the baseline: (1) By integrating depthwise separable convolution with the spatial-frequency localization characteristics of the discrete wavelet transform, a Wavelet Depthwise Separable Convolution (W-DSConv) module is constructed that makes the model lightweight while expanding its receptive field; (2) Leveraging wavelet frequency-domain decomposition, a Wavelet Frequency-Division Quantization (W-FDQ) method for quantization-aware training is proposed, which quantizes features in different frequency bands independently to further compress the lightweight model. Experiments are conducted with YOLO-series models on the VisDrone2021 UAV remote sensing dataset. The results show that: (1) the W-DSConv module reduces model parameters by 45.5% and computational load by 37.4% while keeping the change in detection accuracy within 2.2%; (2) with 6-bit and 4-bit quantization via W-FDQ, the quantized models retain 95.8% and 92.9% of the floating-point model's detection performance, respectively. This research offers a new technical approach to the lightweight deployment of remote sensing object detection models on mobile platforms.
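The two ideas named in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's W-DSConv or W-FDQ implementation: the function names, the single-level Haar transform, and the symmetric max-based quantizer are all assumptions chosen for illustration. The first two helpers compare the weight counts of a standard convolution and a depthwise separable one (bias terms omitted); the remaining ones split a feature map into four wavelet subbands and quantize each band with its own scale.

```python
# Illustrative sketch only. All names here are hypothetical, and the Haar
# wavelet and max-based quantizer are assumptions, not the paper's method.

def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dsconv_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def haar_dwt2(x):
    """Single-level 2D orthonormal Haar transform of an even-sized 2D list.

    Returns four half-resolution subbands: the low-pass average (LL) and
    three detail bands built from column, row, and diagonal differences.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(x), 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2.0)  # average of the 2x2 block
            r_lh.append((a - b + c - d) / 2.0)  # difference across columns
            r_hl.append((a + b - c - d) / 2.0)  # difference across rows
            r_hh.append((a - b - c + d) / 2.0)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def quantize_band(band, bits):
    """Symmetric uniform fake-quantization using this band's own peak value."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(v) for row in band for v in row)
    scale = peak / qmax if peak > 0 else 1.0
    return [
        [max(-qmax, min(qmax, round(v / scale))) * scale for v in row]
        for row in band
    ]


def mse(p, q):
    """Mean squared error between two equally sized 2D lists."""
    diffs = [(u - v) ** 2 for rp, rq in zip(p, q) for u, v in zip(rp, rq)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    print("standard conv weights :", conv_params(3, 64, 128))
    print("separable conv weights:", dsconv_params(3, 64, 128))

    feature_map = [
        [9.0, 8.5, 1.0, 1.2],
        [8.8, 9.1, 0.9, 1.1],
        [2.0, 2.1, 5.0, 4.8],
        [1.9, 2.2, 5.1, 5.2],
    ]
    names = ("LL", "LH", "HL", "HH")
    for name, band in zip(names, haar_dwt2(feature_map)):
        # Each band gets its own scale, so small-magnitude detail bands
        # are not forced onto the coarse grid needed by the LL band.
        print(name, "4-bit MSE:", mse(band, quantize_band(band, 4)))
```

Giving each subband its own scale is the reason for splitting by frequency in this sketch: the low-frequency band usually has a much larger dynamic range than the detail bands, so a single shared scale would spend most of its quantization levels on values the detail bands never reach.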