| [1] |
XIAO X L, SHI W C, ZHENG X T, et al. Multiple models collaboration for ship detection[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(14): 630241 (in Chinese).
|
| [2] |
ZHAO Q C, WU Y Q, YUAN Y B. Progress of ship detection and recognition methods in optical remote sensing images[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(8): 029025 (in Chinese).
|
| [3] |
WANG L B, FANG S H, MENG X L, et al. Building extraction with vision transformer[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5625711.
|
| [4] |
ODONGO R. Remote sensing applications in environmental monitoring[J]. European Journal of Natural Sciences, 2023, 1(1): 1-12.
|
| [5] |
MA A L, CHEN D Y, ZHONG Y F, et al. National-scale greenhouse mapping for high spatial resolution remote sensing imagery using a dense object dual-task deep learning framework: A case study of China[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2021, 181: 279-294.
|
| [6] |
HONG D F, GAO L R, YOKOYA N, et al. More diverse means better: Multimodal deep learning meets remote-sensing imagery classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(5): 4340-4354.
|
| [7] |
KIEU N, NGUYEN K, NAZIB A, et al. Multimodal colearning meets remote sensing: Taxonomy, state of the art, and future works[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, 17: 7386-7409.
|
| [8] |
MA W L, KARAKUŞ O, ROSIN P L. AMM-FuseNet: Attention-based multi-modal image fusion network for land cover mapping[J]. Remote Sensing, 2022, 14(18): 4458.
|
| [9] |
GUO X, LAO J W, DANG B, et al. SkySense: A multi-modal remote sensing foundation model towards universal interpretation for earth observation imagery[C]∥2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2024: 27662-27673.
|
| [10] |
RADFORD A, KIM J W, HALLACY C, et al. Learning transferable visual models from natural language supervision[C]∥International Conference on Machine Learning. PMLR, 2021: 8748-8763.
|
| [11] |
STOJNIC V, RISOJEVIC V. Self-supervised learning of remote sensing scene representations using contrastive multiview coding[C]∥2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Piscataway: IEEE Press, 2021: 1182-1191.
|
| [12] |
HE Y, LIU Y, LI Y W, et al. Development and prospects of multisource information fusion[J]. Acta Aeronautica et Astronautica Sinica, 2025, 46(6): 531672 (in Chinese).
|
| [13] |
LI W J, YANG W, LIU T P, et al. Predicting gradient is better: Exploring self-supervised learning for SAR ATR with a joint-embedding predictive architecture[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2024, 218: 326-338.
|
| [14] |
GIRDHAR R, EL-NOUBY A, LIU Z, et al. ImageBind: One embedding space to bind them all[C]∥2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2023: 15180-15190.
|
| [15] |
PREXL J, SCHMITT M. SenPa-MAE: Sensor parameter aware masked autoencoder for multi-satellite self-supervised pretraining[M]∥Pattern Recognition. Cham: Springer Nature Switzerland, 2025: 317-331.
|
| [16] |
XIONG Z T, WANG Y, ZHANG F H, et al. One for all: Toward unified foundation models for Earth vision[C]∥IGARSS 2024-2024 IEEE International Geoscience and Remote Sensing Symposium. Piscataway: IEEE Press, 2024: 2734-2738.
|
| [17] |
HUANG Y, DU C Z, XUE Z H, et al. What makes multi-modal learning better than single (provably)[C]∥Proceedings of the 35th International Conference on Neural Information Processing Systems. New York: ACM, 2021: 10944-10956.
|
| [18] |
SUN X, TIAN Y, LU W X, et al. From single- to multi-modal remote sensing imagery interpretation: A survey and taxonomy[J]. Science China Information Sciences, 2023, 66(4): 140301.
|
| [19] |
WANG X, CHEN G Y, QIAN G W, et al. Large-scale multi-modal pre-trained models: A comprehensive survey[J]. Machine Intelligence Research, 2023, 20(4): 447-482.
|
| [20] |
FULLER A, GREEN J, MILLARD K. CROMA: Remote sensing representations with contrastive radar-optical masked autoencoders[C]∥Advances in Neural Information Processing Systems 36. New Orleans: Neural Information Processing Systems Foundation, Inc. (NeurIPS), 2023: 5506-5538.
|
| [21] |
WANG Y, ALBRECHT C M, BRAHAM N A ALI, et al. Decoupling common and unique representations for multimodal self-supervised learning[M]∥Computer Vision-ECCV 2024. Cham: Springer Nature Switzerland, 2024: 286-303.
|
| [22] |
LIANG P P, ZADEH A, MORENCY L P. Foundations & trends in multimodal machine learning: Principles, challenges, and open questions[J]. ACM Computing Surveys, 2024, 56(10): 1-42.
|
| [23] |
XIONG Z, WANG Y, ZHANG F, et al. Neural plasticity-inspired foundation model for observing the earth crossing modalities[DB/OL]. arXiv preprint: 2403.15356, 2024.
|
| [24] |
CAO B, GUO J L, ZHU P F, et al. Bi-directional adapter for multimodal tracking[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(2): 927-935.
|
| [25] |
ZHU J W, LAI S M, CHEN X, et al. Visual prompt multi-modal tracking[C]∥2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2023: 9516-9526.
|
| [26] |
CHEN T R, ZHU L Y, DING C T, et al. SAM-adapter: Adapting segment anything in underperformed scenes[C]∥2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Piscataway: IEEE Press, 2023: 3359-3367.
|
| [27] |
LI X, ZHANG G, CUI H, et al. MCANet: A joint semantic segmentation framework of optical and SAR images for land use classification[J]. International Journal of Applied Earth Observation and Geoinformation, 2022, 106: 102638.
|
| [28] |
TONG X Y, XIA G S, LU Q K, et al. Land-cover classification with high-resolution remote sensing images using transferable deep models[J]. Remote Sensing of Environment, 2020, 237: 111322.
|
| [29] |
LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization[DB/OL]. arXiv preprint: 1711.05101, 2017.
|
| [30] |
ZHANG J M, LIU H Y, YANG K L, et al. CMX: Cross-modal fusion for RGB-X semantic segmentation with transformers[J]. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(12): 14679-14694.
|
| [31] |
MA X P, XU X C, ZHANG X K, et al. Adjacent-scale multimodal fusion networks for semantic segmentation of remote sensing data[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, 17: 20116-20128.
|
| [32] |
MA X P, ZHANG X K, PUN M O, et al. A multilevel multimodal fusion transformer for remote sensing semantic segmentation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 5403215.
|
| [33] |
SCHEIBENREIF L, HANNA J, MOMMERT M, et al. Self-supervised vision transformers for land-cover segmentation and classification[C]∥2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Piscataway: IEEE Press, 2022: 1421-1430.
|
| [34] |
STRUDEL R, GARCIA R, LAPTEV I, et al. Segmenter: Transformer for semantic segmentation[C]∥2021 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2021: 7242-7252.
|
| [35] |
HAN B R, ZHANG S, SHI X J, et al. Bridging remote sensors with multisensor geospatial foundation models[C]∥2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2024: 27852-27862.
|