Cover Article


Interpretable fusion association network for multi-source remote sensing ship target based on attribute guidance

  • Zhenyu XIONG,
  • Yaqi CUI,
  • Kai DONG,
  • Mengyang LI,
  • Wei XIONG
  • Institute of Information Fusion, Naval Aviation University, Yantai 264001, China
E-mail: x_zhen_yu@163.com

Received date: 2022-05-20

  Revised date: 2022-06-21

  Accepted date: 2022-07-29

  Online published: 2022-08-03

Supported by

National Science Fund for Young Scholars (62001499); National Natural Science Foundation of China (61790554)


How to cite this article

XIONG Z Y, CUI Y Q, DONG K, LI M Y, XIONG W. Interpretable fusion association network for multi-source remote sensing ship target based on attribute guidance[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(22): 627476-627476. DOI: 10.7527/S1000-6893.2022.27476

Abstract

Multi-source remote sensing ship target association, an important means of early-stage wide-area warning and detection, provides key intelligence support for maritime situation assessment. Existing association algorithms suffer from poorly interpretable association results, difficulty in measuring heterogeneous features, and low multi-source target association accuracy. This paper proposes an attribute-guided interpretable fusion network to solve the multi-source remote sensing ship target association problem. First, to address the large content differences between multi-source images and the resulting difficulty of feature alignment, a global association module is proposed that uses a cross-modal metric loss function to map image features into a common space for measurement. Second, an interpretable module comprising a multi-head attention model and an attribute supervision function is proposed to improve association accuracy and output interpretable association results: the multi-head attention model directs the network to the salient regions of ship targets, while the attribute supervision function guides the model toward discriminative attribute features in ship images, which are used both to explain the decision basis of the association results and to visualize, in quantified form, the contribution of each attribute to the association. Finally, the idea of knowledge distillation is used to reduce the difference between the output feature distances of the global association module and the interpretable module, so that the network achieves accurate association while providing interpretable results. In the experiments, the first multi-source remote sensing ship target dataset is constructed. Test results on this dataset show that the proposed algorithm not only outperforms existing algorithms in association accuracy, but also provides clear and intuitive visual explanations of the association process.
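The abstract does not give the exact loss formulations, but the two training signals it describes — a cross-modal metric loss that measures features in a common embedding space, and a distillation term that aligns the pairwise feature distances of the global and interpretable modules — can be roughly illustrated with the following minimal NumPy sketch. All function and parameter names (`cross_modal_metric_loss`, `margin`, the hinge form of the loss, and the MSE distance-matching form of the distillation) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Project feature rows onto the unit sphere of the common space."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def cross_modal_metric_loss(opt_feats, sar_feats, margin=0.2):
    """Hinge-style cross-modal loss: matched cross-source pairs (row i of
    each batch is the same ship target) are pulled together in the common
    space; mismatched pairs must trail the match by at least `margin`
    in cosine similarity."""
    o = l2_normalize(opt_feats)
    s = l2_normalize(sar_feats)
    sim = o @ s.T                        # (B, B) cosine similarities
    b = sim.shape[0]
    pos = np.diag(sim)[:, None]          # matched-pair similarity per row
    hinge = np.maximum(0.0, margin - pos + sim)
    np.fill_diagonal(hinge, 0.0)         # exclude the matched pairs themselves
    return hinge.sum() / (b * (b - 1))

def pairwise_distances(x):
    """Euclidean distance matrix via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b."""
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.sqrt(np.maximum(d2, 0.0))

def distance_distillation_loss(global_feats, interp_feats):
    """Distillation term: make the pairwise-distance structure of the
    interpretable branch match that of the global association branch
    (in a real training loop the teacher distances would be detached
    from the gradient)."""
    d_teacher = pairwise_distances(l2_normalize(global_feats))
    d_student = pairwise_distances(l2_normalize(interp_feats))
    return float(np.mean((d_student - d_teacher) ** 2))
```

Under this reading, the metric loss shapes a common space where same-target optical/SAR features are neighbors, and the distillation loss transfers that association geometry to the interpretable branch so its attribute-guided outputs remain accurate.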
