A unified detection model for multimodal aerospace remote sensing images based on mixture of experts

doi:10.7527/S1000-6893.2025.32864

Abstract

With the increasing number of remote sensing satellites that China has deployed in orbit, the volume of aerospace remote sensing images, represented by Synthetic Aperture Radar (SAR) and optical (RGB) images, is growing rapidly, as is the demand for tasks such as object detection on these massive datasets. However, because of objective factors such as differences in imaging mechanisms and resolutions, images from different satellites exhibit significant differences in modality features. These differences are particularly pronounced between SAR and RGB remote sensing images, making it difficult for a single model to learn features across different types of remote sensing images. As a result, each satellite typically requires a dedicated detection model, which has become a major obstacle to collaborative recognition and relay detection in satellite remote sensing. To address this issue, this paper proposes a self-distillation multimodal detection model based on a Mixture of Experts (MoE). First, a modality-aware MoE structure is constructed, in which a small number of high-quality experts act as teachers to guide the other experts, while modality-invariant constraints are incorporated to further reduce cross-modality feature shift. Second, a Fourier-enhanced diffusion detection head is developed, which uses frequency-domain feature enhancement to better capture the fine details of detection targets. To evaluate the model, aerospace images were selected and cropped from the public datasets FAIR1M and SARDet-100K, yielding a dataset of 68,983 aerospace remote sensing images for object detection under different backgrounds and imaging mechanisms. Experimental results show that, compared with existing single-modality detection methods, the proposed model performs better on detection tasks in both modalities, with improved mean Average Precision (mAP): it reaches an mAP@50 of 72.5 on RGB images and 87.9 on SAR images. These results indicate that the proposed model has practical value for multimodal aerospace remote sensing object detection and adapts well to different types of satellite remote sensing images.
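The abstract describes the modality-aware MoE only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes linear experts, a gate that sees the features plus a one-hot modality tag, teachers chosen as the experts with the highest gate weight, a mean-squared-error distillation term, and a mean-feature distance as the modality-invariant constraint. All names (`ModalityAwareMoE`, `modality_invariance_penalty`, `num_teachers`) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rng, rows, cols):
    """Small random weights, used only to make the sketch runnable."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ModalityAwareMoE:
    """Toy MoE layer (hypothetical): the gate sees features plus a modality tag."""

    def __init__(self, dim, num_experts, num_modalities=2, seed=0):
        rng = random.Random(seed)
        self.num_modalities = num_modalities
        self.experts = [random_matrix(rng, dim, dim) for _ in range(num_experts)]
        self.gate = random_matrix(rng, num_experts, dim + num_modalities)

    def forward(self, x, modality, num_teachers=1):
        # One-hot modality tag, e.g. 0 = SAR and 1 = RGB (an assumed encoding).
        tag = [1.0 if m == modality else 0.0 for m in range(self.num_modalities)]
        probs = softmax(linear(self.gate, list(x) + tag))
        outputs = [linear(w, x) for w in self.experts]
        dim = len(x)

        # Mixture output: gate-weighted sum of all expert outputs.
        mixed = [sum(p * out[d] for p, out in zip(probs, outputs)) for d in range(dim)]

        # Assumption: experts with the highest gate weight play the teacher role.
        ranked = sorted(range(len(outputs)), key=lambda i: probs[i], reverse=True)
        teachers, students = ranked[:num_teachers], ranked[num_teachers:]
        target = [sum(outputs[t][d] for t in teachers) / len(teachers) for d in range(dim)]

        # Self-distillation term: mean squared error between each student and
        # the teacher average. In training, the target would be a stop-gradient.
        per_student = [
            sum((outputs[s][d] - target[d]) ** 2 for d in range(dim)) / dim
            for s in students
        ]
        distill = sum(per_student) / len(per_student) if per_student else 0.0
        return mixed, probs, distill


def modality_invariance_penalty(feats_a, feats_b):
    """Squared distance between the mean feature vectors of two modalities."""
    dim = len(feats_a[0])
    mean_a = [sum(f[d] for f in feats_a) / len(feats_a) for d in range(dim)]
    mean_b = [sum(f[d] for f in feats_b) / len(feats_b) for d in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))


moe = ModalityAwareMoE(dim=4, num_experts=3)
mixed, probs, distill = moe.forward([0.5, -1.0, 0.2, 0.8], modality=0)
```

In this sketch the distillation term and the invariance penalty would be added to the detection loss with weighting coefficients; how the paper selects teachers and weights these terms is not stated in the abstract.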

Key words: object detection, multimodal aerospace remote sensing images, Mixture of Experts (MoE), self-distillation, Fourier transform, diffusion model

CLC Number: V279

Yuanjie ZHI, Xin GE, Fan ZHANG, Zhi YANG, Mingyang MA, Shaohui MEI. A unified detection model for multimodal aerospace remote sensing images based on mixture of experts[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(10): 532864.

References

[1]	GUI S X, SONG S, QIN R J, et al. Remote sensing object detection in the deep learning era: A review[J]. Remote Sensing, 2024, 16(2): 327.
[2]	DELPLANQUE A, THÉAU J, FOUCHER S, et al. Wildlife detection, counting and survey using satellite imagery: Are we there yet?[J]. GIScience & Remote Sensing, 2024, 61(1): 2348863.
[3]	GAO Z Q, LIU J Y. The research of land potential resources in China based on remote sensing & GIS[J]. National Remote Sensing Bulletin, 2000, 4(2): 136-140 (in Chinese).
[4]	ZHENG Z, ZHONG Y F, WANG J J, et al. Building damage assessment for rapid disaster response with a deep object based semantic change detection framework: From natural disasters to man-made disasters[J]. Remote Sensing of Environment, 2021, 265: 112636.
[5]	AVTAR R, KOUSER A, KUMAR A, et al. Remote sensing for international peace and security: Its role and implications[J]. Remote Sensing, 2021, 13(3): 439.
[6]	ADEGUN A A, FONOU DOMBEU J V, VIRIRI S, et al. State-of-the-art deep learning methods for objects detection in remote sensing satellite images[J]. Sensors, 2023, 23(13): 5849.
[7]	WANG L F, MEI S H, WANG Y, et al. CAMCFormer: Cross-attention and multicorrelation aided transformer for few-shot object detection in optical remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2025, 63: 1-16.
[8]	HAN J M, DING J, LI J, et al. Align deep features for oriented object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-11.
[9]	LIU W, ZHOU L F. Multilevel denoising for high quality SAR object detection in complex scenes[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 1-13.
[10]	GAO G, BAI Q L, ZHANG C, et al. Dualistic cascade convolutional neural network dedicated to fully PolSAR image ship detection[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2023, 202: 663-681.
[11]	WANG C, LU W, LI X, et al. M4-SAR: A multi-resolution, multi-polarization, multi-scene, multi-source dataset and benchmark for Optical-SAR fusion object detection[DB/OL]. arXiv preprint: 2505.10931, 2025.
[12]	WANG Z L, XIONG Z Y, GU X Q. Correlation learning algorithm of visible light and SAR cross modal remote sensing images[J]. Acta Aeronautica et Astronautica Sinica, 2022, 43(S1): 727239 (in Chinese).
[13]	JACOBS R A, JORDAN M I, NOWLAN S J, et al. Adaptive mixtures of local experts[J]. Neural Computation, 1991, 3(1): 79-87.
[14]	SHAZEER N, MIRHOSEINI A, MAZIARZ K, et al. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer[DB/OL]. arXiv preprint: 1701.06538, 2017.
[15]	ZHANG L F, SONG J B, GAO A N, et al. Be your own teacher: Improve the performance of convolutional neural networks via self distillation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2019: 3713-3722.
[16]	WANG Z Y, LI Y L, CHEN X, et al. Detecting everything in the open world: Towards universal object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2023: 11433-11443.
[17]	XIONG Z T, WANG Y, ZHANG F H, et al. One for all: Toward unified foundation models for Earth vision[C]//IGARSS 2024-2024 IEEE International Geoscience and Remote Sensing Symposium. Piscataway: IEEE Press, 2024: 2734-2738.
[18]	LI Y X, LI X, LI Y H, et al. SM3Det: A unified model for multi-modal remote sensing object detection[DB/OL]. arXiv preprint: 2412.20665, 2024.
[19]	LI Y X, LI X, LI W J, et al. SARDet-100K: Towards open-source benchmark and toolkit for large-scale SAR object detection[C]//NIPS'24: Proceedings of the 38th International Conference on Neural Information Processing Systems. Curran Associates Inc., 2024: 128430-128461.
[20]	SUN X, WANG P J, YAN Z Y, et al. FAIR1M: A benchmark dataset for fine-grained object recognition in high-resolution remote sensing imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 184: 116-130.
[21]	LI W T, ZHAO D P, YUAN B, et al. PETDet: Proposal enhancement for two-stage fine-grained object detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 62: 1-14.
[22]	HOU X Q, LIU M Q, ZHANG S L, et al. Relation DETR: Exploring explicit position relation prior for object detection[C]//European Conference on Computer Vision (ECCV). Cham: Springer Nature Switzerland, 2024: 89-105.
[23]	ZHAO J Q, DING Z Y, ZHOU Y, et al. OrientedFormer: An end-to-end transformer-based oriented object detector in remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 1-16.
[24]	DAI Y M, ZOU M R, LI Y X, et al. DenoDet: Attention as deformable multi-subspace feature denoising for target detection in SAR images[J]. IEEE Transactions on Aerospace and Electronic Systems, 2024, 61: 4729-4743.
[25]	ZHOU J, XIAO C, PENG B, et al. DiffDet4SAR: Diffusion-based aircraft target detection network for SAR images[J]. IEEE Geoscience and Remote Sensing Letters, 2024, 21: 1-5.
[26]	LI W J, YANG W, HOU Y N, et al. SARATR-X: Towards building a foundation model for SAR target recognition[J]. IEEE Transactions on Image Processing, 2025, 34: 869-884.

Modality	Method	Year	mAP@50	mAP@75	mAP	Params/M	GFLOPs	FPS
RGB	PETDet[21]	2023	67.5	44.3	42.7	51.1	128	18.5
RGB	R-DETR[22]	2024	72.1	46.2	44.1	48.9	184.2	5.6
RGB	OrientedFormer[23]	2024	71.8	45.8	43.5	44.8	21.6	32.3
RGB	SM3Det[18]	2024	70.9	45.4	43.9	444.3	331.8	8.8
RGB	Proposed method	2025	72.5	45.9	44.2	443.5	312.8	9.2
SAR	DiffDet4SAR[25]	2024	85.1	60.9	52.3	111.2	39.0	7.3
SAR	SARatrX[26]	2024	87.7	62.8	56.3	69.1	37.1	6.4
SAR	GrokSAR[24]	2025	86.0	63.8	58.6	65.8	21.6	35.0
SAR	SM3Det[18]	2024	86.1	62.1	57.2	444.2	157.8	8.8
SAR	Proposed method	2025	87.9	63.9	58.8	421.9	121.4	9.2

Modality	Method	mAP@50 on SAR dataset	mAP@50 on RGB dataset
Originally RGB	PETDet[21]	25.7	67.5
Originally RGB	R-DETR[22]	70.9	72.1
Originally RGB	OrientedFormer[23]	20.0	71.8
Originally SAR	DiffDet4SAR[25]	85.1	62.3
Originally SAR	GrokSAR[24]	87.7	65.1
Originally SAR	SARatrX[26]	86.0	64.2
Unified model	SM3Det[18]	86.1	70.9
Unified model	Proposed method	87.9	72.5

B	S	M	F	mAP@50 (SAR)	mAP@50 (RGB)
√	√			86.0	71.0
√		√		87.1	71.0
√			√	86.9	71.9
√	√	√		86.9	70.9
√	√		√	87.3	71.7
√		√	√	87.2	72.1
√	√	√	√	87.9	72.5

Missing modality	mAP	mAP@50	mAP@75
SAR	43.4	71.8	45.3
RGB	58.4	87.3	62.9

Modality	Detection head	mAP@50
RGB	O-RCNN	70.9
RGB	F-RCNN	70.7
RGB	FCOS	69.9
RGB	Proposed method	72.5
SAR	GFLHead	86.1
SAR	Cascade	85.4
SAR	FCOS	85.9
SAR	Proposed method	87.9
