Special Issue on Intelligent Processing and Analysis of Aerospace Remote Sensing Images

Vessel target association based on multi-view low-altitude remote sensing images

  • Yunhe LIU ,
  • Zhizhuo JIANG ,
  • Yu LIU ,
  • Xian SUN ,
  • You HE
  • 1.Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518000, China
    2.College of Computer Science, Nankai University, Tianjin 300071, China
    3.Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
    4.Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
    5.Key Laboratory of Network Information System Technology, Chinese Academy of Sciences, Beijing 100190, China

Received date: 2025-11-07

Revised date: 2025-12-03

Accepted date: 2026-01-04

Online published: 2026-01-15

Supported by

National Natural Science Foundation of China (62401335)

Cite this article

Yunhe LIU, Zhizhuo JIANG, Yu LIU, Xian SUN, You HE. Vessel target association based on multi-view low-altitude remote sensing images[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(10): 533060. DOI: 10.7527/S1000-6893.2026.33060

Abstract

Vessel target association in low-altitude remote sensing scenarios is an important enabler for maritime monitoring and intelligent perception systems. However, most existing approaches directly transfer pedestrian or vehicle re-identification algorithms and cannot effectively handle problems specific to vessel imagery, particularly the large intra-class variation and missing local information caused by the changing imaging viewpoints of UAVs and other low-altitude remote sensing platforms. These problems often produce outlier samples of the same vessel identity and significantly degrade association accuracy. To address them, this paper proposes a Multi-scale Correlation-aware Transformer network (MCFormer) for vessel target association. Unlike existing methods, MCFormer performs explicit multi-scale global and local correlation modeling over the set of input images. During training it does not rely solely on the isolated features of single images; instead it fuses complementary information across images to suppress the influence of outlier samples caused by intra-class variation or missing local information. Specifically, a Global Correlation Module (GCM) constructs a global similarity matrix over the complete input images and aggregates features according to inter-image consistency, achieving explicit global correlation modeling. A Local Correlation Module (LCM) builds a dynamically updated memory bank to mine and align the local features of positive samples, and extracts local correlations through contextual similarity. Experiments on four publicly available real-world datasets show that the proposed method outperforms existing mainstream methods on all target association accuracy metrics, verifying its effectiveness, robustness, and potential for engineering application.
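The two ideas named in the abstract, similarity-weighted aggregation across a set of images and a dynamically updated memory bank, can be illustrated with a minimal NumPy sketch. This is not the authors' MCFormer implementation: the abstract gives no equations, so the softmax weighting, the `temperature` and `momentum` parameters, the per-identity bank layout, and all function and class names below are assumptions chosen only to make the concepts concrete.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def global_correlation(features, temperature=0.1):
    """Hypothetical global step: build an inter-image cosine similarity
    matrix and aggregate features with softmax weights derived from it."""
    unit = l2_normalize(features)
    sim = unit @ unit.T                          # (N, N) similarity matrix
    logits = sim / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    fused = weights @ features                   # similarity-weighted aggregation
    return sim, weights, fused


class MemoryBank:
    """Hypothetical local step: one slot per identity, refreshed with a
    momentum update (an assumed form of the 'dynamic update mechanism')."""

    def __init__(self, num_ids, dim, momentum=0.9):
        self.bank = np.zeros((num_ids, dim))
        self.momentum = momentum

    def update(self, local_feats, labels):
        for feat, label in zip(l2_normalize(local_feats), labels):
            mixed = self.momentum * self.bank[label] + (1.0 - self.momentum) * feat
            self.bank[label] = l2_normalize(mixed)

    def similarity(self, local_feats):
        """Cosine similarity of each local feature to every stored slot."""
        return l2_normalize(local_feats) @ self.bank.T


# Toy usage with random features standing in for network outputs.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 8))                  # 6 images, 8-dim features
sim, weights, fused = global_correlation(feats)
bank = MemoryBank(num_ids=3, dim=8)
bank.update(feats, labels=[0, 0, 1, 1, 2, 2])
scores = bank.similarity(feats)                  # (6, 3) image-to-identity scores
```

In this sketch an outlier image that is dissimilar to the rest of the set receives small weights in the other rows of `weights`, so it contributes little to their aggregated features, which is one simple way to read the outlier suppression described above.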
