[1] L. Wang, Y. Li, J. Huang and S. Lazebnik, "Learning Two-Branch Neural Networks for Image-Text Matching Tasks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 2, pp. 394-407, 1 Feb. 2019.[2] M. Yang et al., "Multitask Learning for Cross-Domain Image Captioning," IEEE Transactions on Multimedia, vol. 21, no. 4, pp. 1047-1061, April 2019.[3] X. Min, G. Zhai, J. Zhou, X. Zhang, X. Yang and X. Guan, "A Multimodal Saliency Model for Videos With High Audio-Visual Correspondence," IEEE Transactions on Image Processing, vol. 29, pp. 3805-3819, 2020.[4] S. Parekh, S. Essid, A. Ozerov, N. Q. K. Duong, P. Pérez and G. Richard, "Weakly Supervised Representation Learning for Audio-Visual Scene Analysis," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 416-428, 2020.[5] X. Xiang, N. Lv, Z. Yu, M. Zhai and A. El Saddik, "Cross-Modality Person Re-Identification Based on Dual-Path Multi-Branch Network," IEEE Sensors Journal, vol. 19, no. 23, pp. 11706-11713, 1 Dec.1, 2019.[6] A. Wu, W. Zheng, H. Yu, S. Gong and J. Lai, "RGB-Infrared Cross-Modality Person Re-identification," IEEE International Conference on Computer Vision (ICCV), Venice, pp. 5390-5399, 2017.[7] Y. Li, Y. Zhang, X. Huang et al., “Learning Source-Invariant Deep Hashing Convolutional Neural Networks for Cross-Source Remote Sensing Image Retrieval,” IEEE Transactions on Geoscience and Remote Sensing, vol. 56, no. 11, pp. 6521-6536, 2018.[8] W. Xiong, Z. Xiong, Y. Cui and Y. Lv, "A Discriminative Distillation Network for Cross-Source Remote Sensing Image Retrieval," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 13, pp. 1234-1247, 2020.[9] W. Xiong, Y. Lv, X. Zhang and Y. Cui, "Learning to Translate for Cross-Source Remote Sensing Image Retrieval," IEEE Transactions on Geoscience and Remote Sensing, doi: 10.1109/TGRS.2020.2968096.[10] Q. Y. Jiang, W. J. Li, “Deep Cross-Modal Hashing.” In Proc. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, pp. 3270-3278. 2017.[11] M. Gou, Y. Yuan, X. Lu, “Deep Cross-Modal Retrieval for Remote Sensing Image and Audio.” In 2018 10th IAPR Workshop on Pattern Recognition in Remote Sensing (PRRS). IEEE, pp. 1–7, 2018.[12] X. Lu, B. Wang, X. Zheng, X. Li, “Exploring Models and Data for Remote Sensing Image Caption Generation.” IEEE Trans. Geosci. Remote Sens. vol. 2, no. 8, pp. 2183 - 2195, 2017.[13] M. Schmitt, L. H. Hughes, and X. X. Zhu. “The sen1-2 dataset for deep learning in sar-optical data fusion.” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Seiten 141-146. ISPRS TCI Symposium 2018, 10- 12, Karlsruhe, Germany, Okt. 2018.[14] R. Torres, P. Snoeij, D. Geudtner and D. Bibby. et al., “GMES Sentinel-1 mission.” Remote Sensing of Environment 120, pp. 9–24, 2012.[15] M. Drusch, U. Del, S. Carlier and O. Colin et al., “Sentinel-2: ESA’s optical high-resolution mission for GMES operational services.” Remote sensing of Environment 120, pp. 25–36, 2012. |