Long-Tailed Ship Recognition Method Based on Aerial–Space Multimodal Perception (Information Fusion Conference Supplement)

doi:10.7527/S1000-6893.2025.32925


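Wang Shihao¹, Xu Zhengwei², Gao Long³, Xu Cong'an⁴, Lin Yun¹

1. Harbin Engineering University
2. Henan Normal University
3. Naval Aviation University
4. Naval Aeronautical Engineering Institute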

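Received: 2025-10-16 Revised: 2025-10-31 Online: 2025-11-07 Published: 2025-11-07
Corresponding author: Lin Yun
Funding:
National Natural Science Foundation of China; National Natural Science Foundation of China; Taishan Scholars Program; China Postdoctoral Science Foundation; Fundamental Research Funds for the Central Universities; National Natural Science Foundation of China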

Long-Tailed Ship Recognition Method Based on Aerial–Space Multimodal Perception


Abstract

Abstract: In integrated aerial–space ocean monitoring and intelligent maritime management, recognizing ship targets from multisource sensing data collected by unmanned aerial vehicles (UAVs), satellites, and other aerial–space platforms is important for waterway control, maritime law enforcement, and border early warning. Real-world ship recognition, however, faces two major challenges. First, multimodal fusion is difficult: modalities such as visible-light imagery and emitter (electromagnetic radiation) signals are heterogeneous and spatiotemporally misaligned. Second, ship categories naturally follow a severely long-tailed distribution, in which head classes hold most of the samples while tail classes are scarce, and this degrades overall recognition performance. To address these problems, this paper proposes a long-tailed ship recognition method for aerial–space multimodal perception. The method combines a class-aware boundary optimization strategy with a class-based reweighting strategy, improving both the discrimination of tail classes and the robustness of multimodal fusion. Experimental results show that the proposed method outperforms existing methods on representative long-tailed ship recognition tasks, indicating good practicality and generalization.

Key words: Aerial–space remote sensing, Multimodal fusion, Long-tailed recognition, Ship classification, Boundary optimization
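The abstract names two ingredients, class-aware boundary optimization and class-based reweighting, without giving their formulas. As a rough illustration only, the sketch below swaps in two well-known stand-ins from the long-tail literature: an LDAM-style per-class margin proportional to n_j^(-1/4), and class-balanced weights based on the "effective number" of samples. The function names, the sample counts, and the choice of these two techniques are assumptions for illustration, not the authors' formulation.

```python
import math


def class_margins(counts, max_margin=0.5):
    """LDAM-style margins: proportional to n_j^(-1/4), rescaled so that
    the rarest class receives max_margin (an assumed hyperparameter)."""
    raw = [n ** -0.25 for n in counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]


def class_weights(counts, beta=0.999):
    """Class-balanced weights (1 - beta) / (1 - beta^n_j), normalised so
    the weights sum to the number of classes (mean weight of 1)."""
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    total = sum(raw)
    return [r * len(counts) / total for r in raw]


def margin_reweighted_loss(logits, label, margins, weights):
    """Weighted cross-entropy for one sample, computed after subtracting
    the class margin from the true-class logit."""
    adjusted = list(logits)
    adjusted[label] -= margins[label]
    peak = max(adjusted)  # subtract the max for a stable log-sum-exp
    log_z = peak + math.log(sum(math.exp(z - peak) for z in adjusted))
    return weights[label] * (log_z - adjusted[label])


# Hypothetical per-class sample counts, head to tail (not from the paper).
counts = [5000, 500, 50]
margins = class_margins(counts)
weights = class_weights(counts)

# One tail-class sample (label 2) with made-up fused logits.
loss = margin_reweighted_loss([2.0, 1.0, 0.5], 2, margins, weights)
print(margins, weights, loss)
```

Under these assumptions, tail classes receive both a larger margin and a larger loss weight, so a misclassified tail sample contributes more to the gradient than a head sample with the same logits. In a multimodal setting the logits would come from whatever fused representation the model produces; the abstract does not describe that fusion step, so it is left out here.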

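Wang Shihao, Xu Zhengwei, Gao Long, Xu Cong'an, Lin Yun. Long-Tailed Ship Recognition Method Based on Aerial–Space Multimodal Perception (Information Fusion Conference Supplement)[J]. Acta Aeronautica et Astronautica Sinica, doi: 10.7527/S1000-6893.2025.32925.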

