
Research on aircraft fine-grained object detection algorithm in remote sensing images

Zhou Lei¹, Gu Yanfeng², Liu Tianzhu¹

  1. Harbin Institute of Technology
  2.
  • Received: 2025-07-16 Revised: 2025-09-25 Online: 2025-10-09 Published: 2025-10-09
  • Corresponding author: Liu Tianzhu
  • Funding:
    Key Special Project of the National Key Research and Development Program of China

Abstract: Aircraft in remote sensing images often have very similar shapes, and specific models may differ only in subtle details, so accurately detecting and recognizing fine-grained aircraft targets remains a challenge. Among current deep learning-based object detection methods, improvements targeting different components of the model can raise detection accuracy across fine-grained categories to some extent. However, existing approaches overlook the importance of multi-scale discriminative features and inter-class margin constraints in fine-grained tasks, which may limit model performance from feature representation through to feature discrimination. To address this issue, this paper proposes a fine-grained object detection model based on multi-level feature interaction and an orthogonal loss function, the Hierarchical and Orthogonal Fine-Grained Detection Network (HOFD-Net). The model uses a gated fusion mechanism to fuse multi-scale features from different levels, strengthens the representation of discriminative features through attention mechanisms with multiple receptive fields, and applies adaptive loss-term weighting within the orthogonal loss function to improve the intra-class compactness and inter-class separability of features. Together, these improve the model's ability to represent and discriminate fine-grained target features. Multiple ablation and comparative experiments were conducted on two remote sensing fine-grained object detection datasets, MAR20 and SMID. The results show that the proposed model reaches a mean average precision (mAP) of up to 61.45% on MAR20, an improvement of at least 0.43% and at most 6.29% over the baseline model; on SMID it reaches an mAP of up to 63.9%, at least 1.7% above the baseline model. On both datasets, the model achieves the best accuracy and performance among the mainstream algorithms compared.

Key words: remote sensing images, fine-grained object detection, feature fusion, attention mechanism, orthogonal loss
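
The abstract names an orthogonal loss that encourages intra-class compactness and inter-class separability but does not give its formula. The sketch below is only an illustration of that general idea, not the formulation used in HOFD-Net: it assumes cosine similarity between feature vectors, pulls same-class pairs toward similarity 1, and pushes different-class pairs toward similarity 0 (orthogonality). The function name `orthogonal_loss` and the fixed `inter_weight` parameter are hypothetical; the paper's adaptive loss-term weighting is not reproduced here.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def _cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(_normalize(u), _normalize(v)))


def orthogonal_loss(features, labels, inter_weight=1.0):
    """Illustrative orthogonality-style loss over a batch (assumed form).

    intra term: 1 - mean cosine similarity of same-class pairs
                (small when each class is compact).
    inter term: mean |cosine similarity| of different-class pairs
                (small when classes are close to orthogonal).
    inter_weight is a fixed, hypothetical stand-in for adaptive weighting.
    """
    same, diff = [], []
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            c = _cosine(features[i], features[j])
            if labels[i] == labels[j]:
                same.append(c)
            else:
                diff.append(c)
    intra = (1.0 - sum(same) / len(same)) if same else 0.0
    inter = (sum(abs(c) for c in diff) / len(diff)) if diff else 0.0
    return intra + inter_weight * inter


# Two classes whose features lie along orthogonal axes give a loss near zero.
separated = orthogonal_loss([[1, 0], [2, 0], [0, 1], [0, 3]], [0, 0, 1, 1])
# Features that mix the classes are penalized by both terms.
mixed = orthogonal_loss([[1, 0], [0, 1], [1, 0], [0, 1]], [0, 0, 1, 1])
print(separated, mixed)
```

In a detector, a term of this kind would typically be added to the usual classification and box-regression losses and computed on per-object feature embeddings; how HOFD-Net combines and weights the terms is described in the full paper, not in the abstract.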

CLC number: