Acta Aeronautica et Astronautica Sinica, 2026, Vol. 47, Issue (10): 632578-632578   doi: 10.7527/S1000-6893.2025.32578

Special Issue on Intelligent Processing and Analysis of Aerospace Remote Sensing Images

Aircraft fine-grained object detection algorithm in remote sensing images

Lei ZHOU, Yanfeng GU, Tianzhu LIU

  1. School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150006, China
  • Received: 2025-07-16   Revised: 2025-08-08   Accepted: 2025-09-15   Online: 2025-10-10   Published: 2025-10-09
  • Contact: Tianzhu LIU   E-mail: tzliu@hit.edu.cn
  • Supported by:
    National Key Research and Development Program of China (2024YFF1401002)

Abstract:

Aircraft targets in remote sensing images often have similar shapes, and specific models may differ only in subtle details, so accurately detecting and recognizing fine-grained aircraft targets remains a challenge. Among current deep learning-based object detection methods, various improvements targeting different components of the model can improve detection accuracy across fine-grained categories to some extent. However, existing approaches overlook the importance of multi-scale discriminative features and inter-class separation constraints in fine-grained tasks, which may limit model performance from feature representation through feature discrimination. To address this issue, this paper proposes the Hierarchical and Orthogonal Fine-Grained Detection Network, a fine-grained object detection model based on multi-level feature interaction and an orthogonal loss function. The model uses a gated fusion mechanism to fuse multi-scale features from different levels, applies attention mechanisms with multiple receptive fields to strengthen the representation of discriminative features, and incorporates adaptive loss-term weights into the orthogonal loss function to enhance the intra-class compactness and inter-class separability of features. Together, these improve the model's ability to represent and discriminate fine-grained target features. Multiple ablation and comparative experiments were conducted on two remote sensing fine-grained object detection datasets, MAR20 and SMID. The results show that the proposed model achieves a mean average precision of up to 61.45% on MAR20, an improvement of at least 0.43% and at most 6.29% over the baseline model, and up to 63.9% on SMID, an improvement of at least 1.7% over the baseline model. On both datasets, the proposed model achieves the best accuracy and performance among the mainstream algorithms compared.

Key words: remote sensing images, object detection, feature fusion, attention mechanism, orthogonal loss
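
The abstract names two ingredients, gated feature fusion and an orthogonal loss, without giving their formulas. The block below is only an illustrative sketch under stated assumptions, not the authors' implementation: the gate is a scalar-weighted sigmoid (a real detector would learn it, typically with convolutions), and the loss is one common orthogonality-style formulation, `(1 - s) + gamma * |d|`, where `s` and `d` are the mean cosine similarities of same-class and different-class feature pairs. The names `gated_fusion`, `orthogonal_loss`, `w_shallow`, `w_deep`, `bias`, and `gamma` are hypothetical, and `gamma` is a fixed stand-in for the adaptive loss-term weight mentioned in the abstract.

```python
import math
from itertools import combinations


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(shallow, deep, w_shallow=1.0, w_deep=1.0, bias=0.0):
    """Fuse two equal-length feature vectors with an elementwise gate.

    Assumption: the gate is sigmoid(w_shallow*s + w_deep*d + bias) with
    scalar weights; this is a toy stand-in for a learned gating layer.
    """
    fused = []
    for s, d in zip(shallow, deep):
        g = sigmoid(w_shallow * s + w_deep * d + bias)
        # The gate decides how much of each level passes through.
        fused.append(g * s + (1.0 - g) * d)
    return fused


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def orthogonal_loss(features, labels, gamma=1.0):
    """Assumed formulation: (1 - s) + gamma * |d|.

    s: mean cosine similarity over same-class pairs (pushed toward 1).
    d: mean cosine similarity over different-class pairs (pushed toward 0).
    gamma is fixed here; the paper's adaptive weighting is not reproduced.
    """
    same, diff = [], []
    for (fi, yi), (fj, yj) in combinations(zip(features, labels), 2):
        (same if yi == yj else diff).append(cosine(fi, fj))
    s = sum(same) / len(same) if same else 1.0
    d = sum(diff) / len(diff) if diff else 0.0
    return (1.0 - s) + gamma * abs(d)


if __name__ == "__main__":
    print(gated_fusion([0.0, 4.0], [2.0, 0.0], w_shallow=0.0, w_deep=0.0))
    feats = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
    print(orthogonal_loss(feats, [0, 0, 1, 1]))
```

With zero gate weights the gate is 0.5 everywhere, so the fusion reduces to a plain average of the two levels. In the loss, features that are aligned within a class and orthogonal across classes give a value near zero, which is the compact-and-separable geometry the abstract describes.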
