Yuzhuo MA, Kan REN (corresponding author), Tao LI, Qian CHEN
Received: 2025-09-12
Revised: 2025-10-07
Accepted: 2025-10-10
Online: 2025-10-30
Published: 2025-10-17
Contact: Kan REN
E-mail: k.ren@njust.edu.cn
Supported by:
Abstract:
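Advances in deep learning and computer vision have profoundly influenced aerial remote sensing, making the analysis of aerial images more efficient. Compared with conventional images, aerial images have clearer object boundaries, more regular object distributions, and stronger spatial structure. However, current state-of-the-art segmentation methods mainly rely on complex feature extractors to capture stronger contextual relationships and focus on per-pixel classification accuracy. This places high demands on hardware and neglects boundary alignment at the structural level. To address this challenge, a boundary-aware loss function, Lossd, is proposed to improve semantic segmentation of aerial remote sensing images, particularly boundary accuracy and the consistency of segmented objects. Rather than focusing on per-pixel accuracy as conventional methods do, it converts structural differences into a loss. In addition, effective solutions are proposed for the over-segmentation and under-segmentation problems common in semantic segmentation, and extensive experiments are conducted on three widely used datasets and three baseline models. The results show that, without modifying the original models, the proposed method significantly improves segmentation performance, reaching 55.8% mIoU (+1.6%) on LoveDA, 70.8% mIoU (+0.8%) on UAVid, and 94.1% mF1 (+0.7%) on Potsdam, approaching and in some cases surpassing current mainstream methods.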
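The idea of turning boundary misalignment into a loss term, rather than scoring pixels independently, can be illustrated with a toy sketch. The Python below is not the paper's Lossd, whose definition is not given in this abstract. It is an assumed, simplified distance-weighted penalty for a binary mask, and the function names (`boundary_pixels`, `distance_to_boundary`, `distance_weighted_loss`) and the `1 + d` weighting are hypothetical choices made for illustration only.

```python
from collections import deque

# 4-connected neighbourhood offsets (an illustrative choice).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def boundary_pixels(mask):
    """Cells that have at least one 4-neighbour with a different label."""
    h, w = len(mask), len(mask[0])
    cells = []
    for y in range(h):
        for x in range(w):
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] != mask[y][x]:
                    cells.append((y, x))
                    break
    return cells


def distance_to_boundary(mask):
    """City-block distance from each cell to the nearest boundary cell.

    Uses a multi-source breadth-first search. If the mask has no boundary
    at all, every distance is reported as 0 (an assumption of this sketch).
    """
    h, w = len(mask), len(mask[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    for y, x in boundary_pixels(mask):
        dist[y][x] = 0
        queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return [[0 if d is None else d for d in row] for row in dist]


def distance_weighted_loss(prob, mask):
    """Mean absolute error, weighted by (1 + distance to the true boundary)."""
    dist = distance_to_boundary(mask)
    h, w = len(mask), len(mask[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            total += (1 + dist[y][x]) * abs(prob[y][x] - mask[y][x])
    return total / (h * w)


# Toy 1x5 example: the same single-pixel error, near and far from the boundary.
mask = [[0, 0, 0, 1, 1]]
near_error = [[0.0, 0.0, 1.0, 1.0, 1.0]]
far_error = [[1.0, 0.0, 0.0, 1.0, 1.0]]
print(distance_weighted_loss(near_error, mask))
print(distance_weighted_loss(far_error, mask))
```

Under this weighting, a misclassified pixel far from the true boundary costs more than one adjacent to it, so errors that distort object shape are penalized more heavily than small boundary jitter.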
CLC Number:
Yuzhuo MA, Kan REN, Tao LI, Qian CHEN. Improving remote sensing image semantic segmentation based on distance loss[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(8): 332780.
Table 2
Comparison of methods on the LoveDA dataset (columns Building through Forest give IoU per category/%)
| Type | Method | Source | mIoU/% | Building | Road | Water | Barren | Agriculture | Background | Forest |
|---|---|---|---|---|---|---|---|---|---|---|
| Other advanced methods | HRNetw32 | NeurIPS 2021 | 49.8 | 55.3 | 57.4 | 80.0 | 11.1 | 60.9 | 44.6 | 45.2 |
| Other advanced methods | FactSeg | TGRS 2022 | 48.9 | 53.6 | 52.8 | 76.9 | 16.2 | 57.5 | 42.6 | 42.9 |
| Other advanced methods | DC-Swin | GRSL 2022 | 50.6 | 54.5 | 56.2 | 78.1 | 14.5 | 62.4 | 41.3 | 47.2 |
| Other advanced methods | Hi-ResNet | JSTARS 2023 | 52.5 | 58.3 | 55.9 | 80.1 | 17.0 | 62.7 | 46.7 | 46.7 |
| Other advanced methods | Mask DINO | CVPR 2023 | 52.6 | 60.0 | 55.1 | 79.8 | 20.3 | 62.7 | 44.9 | 46.2 |
| Other advanced methods | VLTSeg | ACCV 2024 | 53.8 | 57.9 | 61.3 | 80.5 | 24.1 | 60.2 | 45.8 | 46.5 |
| Other advanced methods | LOGCAN | TGRS 2024 | 53.4 | 58.4 | 56.5 | 80.1 | 18.4 | 56.8 | 47.4 | 47.9 |
| Other advanced methods | AerialFormer-B | RS 2024 | 54.1 | 60.7 | 59.3 | 81.5 | 17.9 | 64.0 | 47.8 | 47.9 |
| Baseline methods | UNetFormer | ISPRS 2022 | 51.9 | 57.9 | 54.1 | 79.1 | 19.8 | 62.3 | 44.2 | 45.7 |
| Baseline methods | MLFMNet-B | JSTARS 2024 | 53.1 | 60.8 | 57.2 | 81.3 | 17.5 | 61.6 | 45.8 | 47.4 |
| Baseline methods | SFA-Net | RS 2024 | 54.2 | 61.0 | 57.8 | 81.4 | 21.5 | 64.8 | 47.3 | 45.8 |
| Lossd-enhanced methods | UNetFormer + Ours | | 53.4 (+1.5) | 62.1 | 57.5 | 81.8 | 20.3 | 63.0 | 44.9 | 46.1 |
| Lossd-enhanced methods | MLFMNet-B + Ours | | 54.3 (+1.2) | 65.5 | 60.5 | 82.8 | 20.0 | 62.8 | 46.3 | 47.7 |
| Lossd-enhanced methods | SFA-Net + Ours | | 55.8 (+1.6) | 65.4 | 60.7 | 83.2 | 21.9 | 65.7 | 47.4 | 46.5 |
Table 3
Comparison of methods on the UAVid dataset (columns Building through Background give IoU per category/%)
| Type | Method | Source | mIoU/% | Building | Road | Tree | Vegetation | Moving car | Static car | Pedestrian | Background |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Other advanced methods | SegFormer | NeurIPS 2021 | 65.4 | 85.4 | 79.9 | 78.5 | 61.8 | 71.8 | 52.1 | 27.8 | 66.3 |
| Other advanced methods | BANet | RS 2021 | 66.0 | 84.5 | 80.0 | 78.3 | 61.3 | 58.8 | 52.2 | 19.9 | 66.3 |
| Other advanced methods | DC-Swin | GRSL 2022 | 68.8 | 88.5 | 82.7 | 80.0 | 64.6 | 74.1 | 59.3 | 30.7 | 70.3 |
| Other advanced methods | DecoupleNet D2 | TGRS 2024 | 65.1 | 84.4 | 79.9 | 78.2 | 61.3 | 73.6 | 48.8 | 30.2 | 64.6 |
| Other advanced methods | MMLN | JSTARS 2024 | 69.5 | 88.4 | 81.9 | 80.9 | 65.7 | 62.0 | 74.8 | 32.5 | 69.6 |
| Other advanced methods | Mask DINO | CVPR 2023 | 67.9 | 87.3 | 81.5 | 80.2 | 63.7 | 73.6 | 56.2 | 31.0 | 68.6 |
| Other advanced methods | VLTSeg | ACCV 2024 | 69.5 | 89.6 | 83.3 | 80.7 | 65.1 | 74.9 | 59.7 | 31.9 | 70.6 |
| Baseline methods | UNetFormer | ISPRS 2022 | 67.4 | 87.2 | 81.1 | 79.8 | 63.1 | 73.3 | 55.9 | 30.6 | 68.2 |
| Baseline methods | MLFMNet-B | JSTARS 2024 | 70.0 | 89.5 | 82.2 | 81.0 | 64.3 | 76.1 | 64.7 | 32.3 | 70.0 |
| Baseline methods | SFA-Net | RS 2024 | 69.9 | 88.7 | 82.4 | 80.4 | 64.0 | 77.1 | 66.9 | 30.2 | 69.7 |
| Lossd-enhanced methods | UNetFormer + Ours | | 67.8 (+0.4) | 88.1 | 81.9 | 80.2 | 63.3 | 73.5 | 56.2 | 30.9 | 68.5 |
| Lossd-enhanced methods | MLFMNet-B + Ours | | 70.8 (+0.8) | 91.4 | 83.5 | 81.4 | 64.9 | 76.4 | 65.6 | 32.6 | 70.3 |
| Lossd-enhanced methods | SFA-Net + Ours | | 70.5 (+0.6) | 90.0 | 83.8 | 80.9 | 64.7 | 77.2 | 67.3 | 30.3 | 70.1 |
Table 4
Comparison of methods on the Potsdam dataset (columns Impervious surface through Car give F1 per category/%)
| Type | Method | Source | mF1/% | Impervious surface | Building | Low vegetation | Tree | Car |
|---|---|---|---|---|---|---|---|---|
| Other advanced methods | DC-Swin | GRSL 2022 | 93.3 | 94.2 | 97.6 | 88.6 | 89.6 | 96.3 |
| Other advanced methods | EfficientUNets | TGRS 2023 | 93.5 | 94.8 | 98.2 | 89.5 | 90.5 | 94.6 |
| Other advanced methods | Mask DINO | CVPR 2023 | 93.2 | 94.1 | 96.9 | 89.5 | 88.7 | 96.8 |
| Other advanced methods | VLTSeg | ACCV 2024 | 93.8 | 95.2 | 97.4 | 89.3 | 89.2 | 98.0 |
| Other advanced methods | Vit-G12X4 | JSTARS 2024 | 92.1 | 92.8 | 96.9 | 85.9 | 89.0 | 96.0 |
| Other advanced methods | AerialFormer-B | RS 2024 | 94.1 | 95.5 | 98.1 | 89.8 | 89.8 | 97.5 |
| Baseline methods | UNetFormer | ISPRS 2022 | 92.3 | 93.1 | 96.7 | 87.4 | 88.5 | 96.0 |
| Baseline methods | MLFMNet-B | JSTARS 2024 | 93.4 | 94.6 | 97.5 | 88.4 | 88.7 | 96.9 |
| Baseline methods | SFA-Net | RS 2024 | 93.2 | 94.6 | 97.2 | 88.1 | 89.5 | 96.7 |
| Lossd-enhanced methods | UNetFormer + Ours | | 93.1 (+0.8) | 94.4 | 98.3 | 87.7 | 88.9 | 96.4 |
| Lossd-enhanced methods | MLFMNet-B + Ours | | 94.1 (+0.7) | 96.1 | 98.9 | 89.2 | 89.3 | 97.2 |
| Lossd-enhanced methods | SFA-Net + Ours | | 94.0 (+0.8) | 96.2 | 98.4 | 88.6 | 90.1 | 96.9 |
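The tables above report mIoU and mF1 alongside per-category scores. As a reminder of how such scores are commonly derived, the sketch below computes per-class IoU and F1 from a confusion matrix and averages them. It assumes integer class labels, unweighted averaging over all classes, and a score of 0 for classes with no pixels. None of these details is stated in the source, and benchmark protocols may also ignore some classes. The function names are illustrative.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels with true class t that were predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_scores(cm):
    """Return (IoU list, F1 list), one entry per class."""
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        iou_den = tp + fp + fn
        f1_den = 2 * tp + fp + fn
        # Assumption of this sketch: a class with no pixels scores 0.
        ious.append(tp / iou_den if iou_den else 0.0)
        f1s.append(2 * tp / f1_den if f1_den else 0.0)
    return ious, f1s


def mean(values):
    return sum(values) / len(values)


# Toy example with two classes and four flattened pixels.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
ious, f1s = per_class_scores(cm)
print("mIoU/%:", round(100 * mean(ious), 1))
print("mF1/%:", round(100 * mean(f1s), 1))
```

A parenthesized gain such as (+1.6) in the tables is the difference between the enhanced and baseline mean scores, for example 55.8 versus 54.2 for SFA-Net on LoveDA.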
[1] LUO X D, WU Y Q, CHEN J L. Research progress on deep learning methods for object detection and semantic segmentation in UAV aerial images[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(6): 028822 (in Chinese).
[2] JIANG B, QU R K, LI Y D, et al. Object detection in UAV imagery based on deep learning: Review[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(4): 524519 (in Chinese).
[3] CHANG F Z, MA T Y, WANG D T, et al. Method for building segmentation and extraction from high-resolution remote sensing images based on improved YOLOv5ds[J]. PLOS One, 2025, 20(3): e0317106.
[4] LIU Q H, KAMPFFMEYER M, JENSSEN R, et al. Multi-modal land cover mapping of remote sensing images using pyramid attention and gated fusion networks[J]. International Journal of Remote Sensing, 2022, 43(9): 3509-3535.
[5] GHAMISI P, YOKOYA N, LI J, et al. Advances in hyperspectral image and signal processing: A comprehensive overview of the state of the art[J]. IEEE Geoscience and Remote Sensing Magazine, 2017, 5(4): 37-78.
[6] LI A J, JIAO L C, ZHU H, et al. Multitask semantic boundary awareness network for remote sensing image segmentation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5400314.
[7] LIU R, MI L, CHEN Z Z. AFNet: Adaptive fusion network for remote sensing image semantic segmentation[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(9): 7871-7886.
[8] LI Y X, HOU Q B, ZHENG Z H, et al. Large selective kernel network for remote sensing object detection[C]∥2023 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE Press, 2024: 16748-16759.
[9] WANG L B, LI R, ZHANG C, et al. UNetFormer: A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 190: 196-214.
[10] WANG D, ZHANG Q M, XU Y F, et al. Advancing plain vision transformer toward remote sensing foundation model[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5607315.
[11] LI R, ZHENG S Y, ZHANG C, et al. ABCNet: Attentive bilateral contextual network for efficient semantic segmentation of Fine-Resolution remotely sensed imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2021, 181: 84-98.
[12] KERVADEC H, BOUCHTIBA J, DESROSIERS C, et al. Boundary loss for highly unbalanced segmentation[J]. Medical Image Analysis, 2021, 67: 101851.
[13] ZHAO S, WANG Y, YANG Z, et al. Region mutual information loss for semantic segmentation[C]∥Proceedings of the 33rd International Conference on Neural Information Processing Systems. New York: ACM, 2019: 11117-11127.
[14] LI X L, CHEN J S, ZHAO L L, et al. Adaptive distance-weighted voronoi tessellation for remote sensing image segmentation[J]. Remote Sensing, 2020, 12(24): 4115.
[15] BOKHOVKIN A, BURNAEV E. Boundary loss for remote sensing imagery semantic segmentation[C]∥Advances in Neural Networks-ISNN 2019. Cham: Springer, 2019: 388-401.
[16] GÜL F, APTOULA E. A distance transform based loss function for the semantic segmentation of very high resolution remote sensing images[C]∥IGARSS 2024-2024 IEEE International Geoscience and Remote Sensing Symposium. Piscataway: IEEE Press, 2024: 9888-9891.
[17] BERTASIUS G, SHI J B, TORRESANI L. Semantic segmentation with boundary neural fields[C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2016: 3602-3610.
[18] WEN G J, WANG R S. Automatic extraction of main roads from aerial remote sensing images[J]. Journal of Software, 2000, 11(7): 957-964 (in Chinese).
[19] SHELHAMER E, LONG J, DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640-651.
[20] RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]∥Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015. Cham: Springer, 2015: 234-241.
[21] CHEN L C, ZHU Y K, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]∥Computer Vision-ECCV 2018. Cham: Springer, 2018: 833-851.
[22] BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.
[23] SHEKHOVTSOV A, YANUSH V. Reintroducing straight-through estimators as principled methods for stochastic binary networks[C]∥Pattern Recognition (DAGM GCPR 2021). Cham: Springer, 2021: 111-126.
[24] WANG J J, ZHENG Z, MA A L, et al. LoveDA: A remote sensing land-cover dataset for domain adaptive semantic segmentation[DB/OL]. arXiv: , 2021.
[25] LYU Y, VOSSELMAN G, XIA G S, et al. UAVid: A semantic segmentation dataset for UAV imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 165: 108-119.
[26] ISPRS. Potsdam 2D semantic labeling dataset[DB/OL]. (2015-04-01)[2025-06-20]. .
[27] HWANG G, JEONG J, LEE S J. SFA-Net: Semantic feature adjustment network for remote sensing image segmentation[J]. Remote Sensing, 2024, 16(17): 3278.
[28] WEI X Y, RAO L, FAN G Y, et al. MLFMNet: A multilevel feature mining network for semantic segmentation on aerial images[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, 17: 16165-16179.
[29] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
[30] CHENG B W, GIRSHICK R, DOLLÁR P, et al. Boundary IoU: Improving object-centric image segmentation evaluation[C]∥2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2021: 15329-15337.
[31] MA A L, WANG J J, ZHONG Y F, et al. FactSeg: Foreground activation-driven small object semantic segmentation in large-scale remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5606216.
[32] WANG L B, LI R, DUAN C X, et al. A novel transformer based semantic segmentation scheme for fine-resolution remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 6506105.
[33] CHEN Y X, FANG P C, ZHONG X L, et al. Hi-ResNet: Edge detail enhancement for high-resolution remote sensing segmentation[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, 17: 15024-15040.
[34] LI F, ZHANG H, XU H Z, et al. Mask DINO: Towards a unified transformer-based framework for object detection and segmentation[C]∥2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2023: 3041-3050.
[35] HÜMMER C, SCHWONBERG M, ZHOU L W, et al. Strong but simple: A baseline for domain generalized dense perception by CLIP-based transfer learning[C]∥Computer Vision-ACCV 2024. Singapore: Springer, 2025: 463-484.
[36] MA X W, LIAN R R, WU Z K, et al. LOGCAN++: Adaptive local-global class-aware network for semantic segmentation of remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2025, 63: 4404216.
[37] HANYU T, YAMAZAKI K, TRAN M, et al. AerialFormer: Multi-resolution transformer for aerial image segmentation[J]. Remote Sensing, 2024, 16(16): 2930.
[38] XIE E Z, WANG W H, YU Z D, et al. SegFormer: Simple and efficient design for semantic segmentation with transformers[DB/OL]. arXiv preprint: 2105.15203, 2021.
[39] WANG L B, LI R, WANG D Z, et al. Transformer meets convolution: A bilateral awareness network for semantic segmentation of very fine resolution urban scene images[J]. Remote Sensing, 2021, 13(16): 3065.
[40] LU W, CHEN S B, SHU Q L, et al. DecoupleNet: A lightweight backbone network with efficient feature decoupling for remote sensing visual tasks[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 4414613.
[41] SUN H, XIE Y C, REN D, et al. MMLN: Multi-directional and multi-constraint learning network for remote sensing imagery semantic segmentation[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024: 1-16.
[42] ALMARZOUQI H, SAAD SAOUD L. Semantic labeling of high-resolution images using Efficient UNets and transformers[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 4402913.
[43] CHA K, SEO J, LEE T. A billion-scale foundation model for remote sensing images[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024: 1-17.

