Acta Aeronautica et Astronautica Sinica > 2026, Vol. 47 Issue (10): 532864-532864   doi: 10.7527/S1000-6893.2025.32864

Special Issue on Intelligent Processing and Analysis of Aerospace Remote Sensing Images

A unified detection model for multimodal aerospace remote sensing images based on mixture of experts

Yuanjie ZHI1, Xin GE1, Fan ZHANG2, Zhi YANG3, Mingyang MA1, Shaohui MEI1

  1. School of Electronic Information, Northwestern Polytechnical University, Xi’an 710129, China
  2. China Academy of Launch Vehicle Technology, Beijing 100076, China
  3. State Grid Electric Power Engineering Research Institute Co., Ltd., Beijing 102209, China
  • Received: 2025-10-09 Revised: 2025-11-06 Accepted: 2025-12-11 Online: 2025-12-25 Published: 2025-12-23
  • Contact: Shaohui MEI E-mail: meish@nwpu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62571442); National Natural Science Foundation of China (62006193)

Abstract:

As the number of remote sensing satellites that China has deployed in orbit grows, the volume of aerospace remote sensing imagery, typified by Synthetic Aperture Radar (SAR) and optical (RGB) images, is increasing rapidly, and so is the demand for tasks such as object detection on these massive data. However, because of differences in imaging mechanism and resolution, images from different satellites differ markedly in their modality-specific features. The gap is especially pronounced between SAR and RGB images, which makes it difficult for a single model to learn features from both types of imagery. As a result, each satellite typically needs its own dedicated detection model, which has become a major obstacle to collaborative recognition and relay detection across satellites. To address this problem, this paper proposes a self-distillation multimodal detection model based on a Mixture of Experts (MoE). First, a modality-aware MoE architecture is constructed in which a small number of high-quality experts act as teachers that guide the remaining experts, and a modality-invariance constraint is added to further reduce cross-modal feature shift. Second, a Fourier-enhanced diffusion detection head is built that uses frequency-domain feature enhancement to better capture the details of detection targets. To evaluate the model, aerospace images were selected from the public datasets FAIR1M and SARDet_100K and cropped, yielding an object detection dataset of 68 983 aerospace remote sensing images that covers different backgrounds and imaging mechanisms. Experimental results show that, compared with existing single-modality detection methods, the proposed model performs better on detection tasks in both modalities, with a significant improvement in mean Average Precision (mAP). These results indicate that the proposed model has good application value for object detection in multimodal aerospace remote sensing images and adapts well to imagery from multiple types of satellites.
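
The teacher-guided MoE idea described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes linear experts, a softmax gate, and that the "high-quality" teachers are simply the experts with the largest gate weights. The names `moe_forward` and `num_teachers` are hypothetical, and the modality-invariance constraint is not modelled.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def moe_forward(x, gate_w, expert_ws, num_teachers=1):
    """Return (fused output, gate weights, self-distillation loss).

    Assumption: teachers are the experts with the highest gate weight;
    the remaining experts are students pulled toward the teachers' mean output.
    """
    gate = softmax(linear(gate_w, x))            # one weight per expert
    outputs = [linear(w, x) for w in expert_ws]  # every expert sees the input
    dim = len(outputs[0])

    # Gate-weighted fusion of the expert outputs.
    fused = [sum(g * out[d] for g, out in zip(gate, outputs)) for d in range(dim)]

    # Rank experts by gate weight and split them into teachers and students.
    ranked = sorted(range(len(gate)), key=lambda i: gate[i], reverse=True)
    teachers, students = ranked[:num_teachers], ranked[num_teachers:]
    target = [sum(outputs[t][d] for t in teachers) / len(teachers) for d in range(dim)]

    # Mean squared error between each student and the teacher target.
    distill = 0.0
    for s in students:
        distill += sum((outputs[s][d] - target[d]) ** 2 for d in range(dim)) / dim
    distill /= max(len(students), 1)
    return fused, gate, distill


if __name__ == "__main__":
    random.seed(0)
    in_dim, out_dim, n_experts = 3, 2, 4
    gate_w = [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(n_experts)]
    expert_ws = [
        [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
        for _ in range(n_experts)
    ]
    fused, gate, distill = moe_forward([0.5, -1.0, 2.0], gate_w, expert_ws)
    print("gate weights:", gate)
    print("fused output:", fused)
    print("distillation loss:", distill)
```

In training, a loss of this kind would be added to the detection loss so that weaker experts are pulled toward the stronger ones. How the paper actually selects teachers and weights this term is not stated in the abstract.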

Key words: object detection, multimodal aerospace remote sensing images, Mixture of Experts (MoE), self-distillation, Fourier transform, diffusion model
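
The frequency-domain feature enhancement named in the abstract and key words can be illustrated in one dimension. The sketch below is only an assumption-laden toy: it transforms a signal with a discrete Fourier transform, multiplies the components above a cutoff by a gain, and transforms back. The function name `enhance_high_freq` and the parameters `cutoff` and `gain` are hypothetical; the paper's detection head operates on image features inside a diffusion model, which is not reproduced here.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT; returns the real part, assuming a conjugate-symmetric spectrum."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def enhance_high_freq(signal, cutoff, gain):
    """Scale every frequency bin above `cutoff` by `gain` and return the new signal.

    Bins k and n - k share one frequency index, so both receive the same gain
    and the output stays real for real input.
    """
    n = len(signal)
    spectrum = dft(signal)
    boosted = []
    for k, coeff in enumerate(spectrum):
        freq = min(k, n - k)  # symmetric frequency index
        boosted.append(coeff * gain if freq > cutoff else coeff)
    return idft(boosted)


if __name__ == "__main__":
    # A slow ramp with a sharp step: the step carries the fine detail.
    signal = [0.0, 0.1, 0.2, 0.3, 1.3, 1.4, 1.5, 1.6]
    print(enhance_high_freq(signal, cutoff=1, gain=2.0))
```

Boosting the high-frequency bins sharpens edges and fine structure while leaving the low-frequency content, including the mean, unchanged. This is the general motivation for frequency-domain enhancement of small-target details.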

CLC Number: