Because visible and SAR images are produced by different imaging mechanisms, their content differs, their deep features are difficult to align, and cross-modal correlation is slow. A deep cross-modal hash network model is proposed to realize cross-modal correlation between SAR images and optical images. First, to address the large difference in color information between SAR and optical remote sensing images, an image transformation mechanism is proposed: four different types of spectral images are generated from each optical image and fed into the network, disrupting the color channels so that the network attends to the texture and contour information of the image and becomes insensitive to color. Second, to address the high noise of SAR images and the heterogeneous content of the two modality images of the same scene, an image-pair training strategy is proposed to reduce the feature difference between cross-modal images. Third, to address low correlation efficiency and high storage consumption, a triplet hash loss function is proposed to improve the association accuracy of the model and reduce the association time. Finally, a SAR and optical dual-modality remote sensing dataset is constructed to make up for the lack of data in this field. Experiments verify the practicability of the dataset and the effectiveness of the proposed algorithm.
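The abstract does not specify the four spectral transforms; the sketch below is one plausible reading (an assumption, not the authors' method), in which the optical RGB channels are permuted and one view is reduced to grayscale, so that a network trained on these views must rely on texture and contours rather than color.

```python
import numpy as np

def channel_disrupted_views(optical, n_views=4, seed=None):
    """Generate spectrally perturbed views of an optical RGB image.

    Hypothetical recreation of the transformation mechanism: one view
    is collapsed to grayscale (no color at all), and the remaining
    views randomly permute the color channels.
    """
    rng = np.random.default_rng(seed)
    views = []
    # Grayscale view: removes color information entirely.
    gray = optical.mean(axis=-1, keepdims=True).repeat(3, axis=-1)
    views.append(gray.astype(optical.dtype))
    # Remaining views: shuffle the color channels.
    while len(views) < n_views:
        perm = rng.permutation(3)
        views.append(optical[..., perm])
    return views

# Usage: four views of a 256x256 RGB patch.
img = np.random.rand(256, 256, 3).astype(np.float32)
augmented = channel_disrupted_views(img, n_views=4, seed=0)
assert len(augmented) == 4
```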
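The form of the triplet hash loss is likewise not given here. A common construction for deep cross-modal hashing, sketched below under that assumption, combines a triplet margin term on the real-valued codes of the two modality branches with a quantization penalty that pushes codes toward ±1:

```python
import torch
import torch.nn.functional as F

def triplet_hash_loss(anchor, positive, negative, margin=0.5, quant_weight=0.1):
    """Hedged sketch of a triplet hash loss (assumed formulation).

    anchor/positive/negative: (batch, n_bits) real-valued codes in
    [-1, 1], e.g. tanh outputs of the SAR and optical branches. The
    ranking term pulls cross-modal codes of the same scene together
    and pushes different scenes apart; the quantization term drives
    codes toward binary {-1, +1} so that cheap Hamming distance can
    replace Euclidean distance at retrieval time.
    """
    ranking = F.triplet_margin_loss(anchor, positive, negative, margin=margin)
    quant = ((anchor.abs() - 1.0) ** 2).mean() \
          + ((positive.abs() - 1.0) ** 2).mean() \
          + ((negative.abs() - 1.0) ** 2).mean()
    return ranking + quant_weight * quant
```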
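Hashing is what underwrites the efficiency and storage claims: a binarized code occupies only n_bits/8 bytes per image, and association reduces to XOR plus popcount over the database. A minimal illustration (helper names are hypothetical):

```python
import numpy as np

def to_binary(codes):
    """Binarize real-valued network outputs into packed uint8 codes."""
    bits = (codes > 0).astype(np.uint8)           # (n, n_bits) in {0, 1}
    return np.packbits(bits, axis=1)              # n_bits/8 bytes per image

def hamming_rank(query, database):
    """Rank database codes by Hamming distance to a query code."""
    xor = np.bitwise_xor(database, query)         # differing bits
    dist = np.unpackbits(xor, axis=1).sum(axis=1) # popcount per row
    return np.argsort(dist)

# 64-bit codes: 8 bytes per image instead of a float feature vector.
db = to_binary(np.random.randn(1000, 64))
q = to_binary(np.random.randn(1, 64))
order = hamming_rank(q, db)
```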
WANG Zi-Ling, XIONG Zhen-Yu, GU Xiang-Qi. Correlation learning algorithm of visible light and SAR cross modal remote sensing images[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 0: 0-0.
DOI: 10.7527/S1000-6893.2021.27239