ACTA AERONAUTICA ET ASTRONAUTICA SINICA
Improving remote sensing image semantic segmentation based on distance loss
Received date: 2025-09-12
Revised date: 2025-10-07
Accepted date: 2025-10-10
Online published: 2025-10-17
Supported by
National Natural Science Foundation of China (62175111); Qing Lan Project (2024)
Advances in deep learning and computer vision have had a profound impact on airborne remote sensing, making the analysis of aerial images more efficient. Compared with conventional images, target boundaries in aerial images are clearer and more distinct, with more regular distributions and stronger spatial structure. However, current state-of-the-art segmentation methods mainly rely on complex feature extractors to capture stronger contextual relationships and emphasize single-pixel classification accuracy. This not only raises hardware requirements but also overlooks boundary alignment from a structural perspective. To address this challenge, we propose a boundary-aware loss function, Loss_d, designed to improve semantic segmentation of aerial remote sensing images, particularly in terms of boundary precision and target segmentation consistency. Unlike traditional methods that focus on single-pixel accuracy, we translate structural differences between prediction and ground truth into a loss term. We also propose an effective solution to the over-segmentation and under-segmentation problems common in semantic segmentation tasks. Extensive experiments were conducted on three widely used large-scale datasets with three benchmark models. The results show that our method significantly improves semantic segmentation performance without modifying the original network. Specifically, our method achieves 55.8% mIoU (+1.6%) on LoveDA, 70.8% mIoU (+0.8%) on UAVid, and 94.1% mF1 (+0.7%) on Potsdam, reaching or partially surpassing the performance of mainstream approaches.
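The abstract does not give the formula of Loss_d, so the following is only a minimal sketch of the general idea of turning a structural (distance-based) difference into a penalty, not the authors' method. It assumes a binary mask and weights each pixel's error by its Euclidean distance to the nearest ground-truth pixel of the opposite class, so mistakes far from the true boundary cost more than mistakes next to it. The function names `distance_to_opposite_class` and `distance_weighted_error` are hypothetical, and the brute-force distance computation is only practical for small masks.

```python
import numpy as np


def distance_to_opposite_class(mask: np.ndarray) -> np.ndarray:
    """Distance from each pixel to the nearest pixel of the other class.

    Brute-force pairwise computation, intended for small illustrative masks.
    Returns all zeros when the mask contains a single class (no boundary).
    """
    mask = mask.astype(bool)
    dist = np.zeros(mask.shape, dtype=float)
    fg = np.argwhere(mask)    # coordinates of foreground pixels
    bg = np.argwhere(~mask)   # coordinates of background pixels
    if len(fg) == 0 or len(bg) == 0:
        return dist
    # Pairwise Euclidean distances between background and foreground pixels.
    pairwise = np.sqrt(((bg[:, None, :] - fg[None, :, :]) ** 2).sum(axis=-1))
    dist[~mask] = pairwise.min(axis=1)  # background pixel -> nearest foreground
    dist[mask] = pairwise.min(axis=0)   # foreground pixel -> nearest background
    return dist


def distance_weighted_error(prob_fg: np.ndarray, gt_mask: np.ndarray) -> float:
    """Mean absolute error weighted by distance to the ground-truth boundary.

    Hypothetical illustration only; this is NOT the paper's Loss_d.
    """
    weights = distance_to_opposite_class(gt_mask)
    error = np.abs(prob_fg - gt_mask.astype(float))
    return float((weights * error).mean())


# Toy example: one foreground pixel in the centre of a 5x5 image.
gt = np.zeros((5, 5), dtype=int)
gt[2, 2] = 1

perfect = gt.astype(float)
near = perfect.copy()
near[2, 3] = 1.0   # false positive adjacent to the object
far = perfect.copy()
far[0, 0] = 1.0    # false positive in a distant corner

print("perfect:", distance_weighted_error(perfect, gt))
print("near   :", distance_weighted_error(near, gt))
print("far    :", distance_weighted_error(far, gt))
```

In this toy case the distant false positive is penalized more than the adjacent one, which a plain per-pixel loss would treat identically. A practical version would run on multi-class network outputs inside the training framework and replace the brute-force step with a proper distance transform such as `scipy.ndimage.distance_transform_edt`.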
Yuzhuo MA, Kan REN, Tao LI, Qian CHEN. Improving remote sensing image semantic segmentation based on distance loss[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2026, 47(8): 332780-332780. DOI: 10.7527/S1000-6893.2025.32780