Electronics, Electrical Engineering and Control

Improving remote sensing image semantic segmentation based on distance loss

  • Yuzhuo MA,
  • Kan REN,
  • Tao LI,
  • Qian CHEN
  • School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
E-mail: k.ren@njust.edu.cn

Received date: 2025-09-12

Revised date: 2025-10-07

Accepted date: 2025-10-10

Online published: 2025-10-17

Supported by

National Natural Science Foundation of China (62175111); Qing Lan Project of Jiangsu Universities (2024)

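Cite this article

MA Yuzhuo, REN Kan, LI Tao, CHEN Qian. Improving remote sensing image semantic segmentation based on distance loss[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(8): 332780-332780. DOI: 10.7527/S1000-6893.2025.32780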

Abstract

Advances in deep learning and computer vision have had a profound impact on airborne remote sensing, making the analysis of aerial images more efficient. Compared with conventional images, aerial images have clearer and more distinct target boundaries, more regular object distributions, and stronger spatial structure. However, current state-of-the-art segmentation methods mainly rely on complex feature extractors to capture stronger contextual relationships and emphasize single-pixel classification accuracy. This not only raises hardware requirements but also overlooks boundary alignment at the structural level. To address this challenge, we propose a boundary-aware loss function, Lossd, designed to improve semantic segmentation of aerial remote sensing images, particularly in terms of boundary precision and target segmentation consistency. Unlike traditional methods that focus on single-pixel accuracy, Lossd translates structural differences into a loss. We also propose an effective solution to the over-segmentation and under-segmentation problems common in semantic segmentation. Extensive experiments were conducted on three widely used large-scale datasets and three benchmark models. The results show that our method significantly improves semantic segmentation performance without modifying the original network. Specifically, it achieves 55.8% mIoU (+1.6%) on LoveDA, 70.8% mIoU (+0.8%) on UAVid, and 94.1% mF1 (+0.7%) on Potsdam, approaching and in some cases surpassing current mainstream approaches.
