Improving remote sensing image semantic segmentation based on distance loss

doi:10.7527/S1000-6893.2025.32780

Abstract

Advances in deep learning and computer vision have had a profound impact on airborne remote sensing, making the analysis of aerial images more efficient. Compared with conventional images, targets in aerial images have clearer and more distinct boundaries, more regular distributions, and stronger spatial structure. However, current state-of-the-art segmentation methods mainly rely on complex feature extractors to capture stronger contextual relationships and emphasize single-pixel classification accuracy. This not only raises hardware requirements but also overlooks boundary alignment from a structural perspective. To address this challenge, we propose a boundary-aware loss function, Lossd, designed to improve semantic segmentation of aerial remote sensing images, particularly boundary precision and target segmentation consistency. Unlike traditional methods that focus on single-pixel accuracy, we translate structural differences into a loss. We also propose an effective solution to the over-segmentation and under-segmentation problems common in semantic segmentation tasks. Extensive experiments were conducted on three widely used large-scale datasets and three benchmark models. The results show that our method improves semantic segmentation performance without modifying the original network. Specifically, our method achieves 55.8% mIoU (+1.6%) on LoveDA, 70.8% mIoU (+0.8%) on UAVid, and 94.1% mF1 (+0.7%) on Potsdam, matching or partially surpassing mainstream approaches.

Key words: aerial images, deep learning, semantic segmentation, loss function, boundary constraints

Yuzhuo MA, Kan REN, Tao LI, Qian CHEN. Improving remote sensing image semantic segmentation based on distance loss[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(8): 332780.

References 43

[1]	LUO X D， WU Y Q， CHEN J L. Research progress on deep learning methods for object detection and semantic segmentation in UAV aerial images［J］. Acta Aeronautica et Astronautica Sinica， 2024， 45（6）： 028822 （in Chinese）.
[2]	JIANG B， QU R K， LI Y D， et al. Object detection in UAV imagery based on deep learning： Review［J］. Acta Aeronautica et Astronautica Sinica， 2021， 42（4）： 524519 （in Chinese）.
[3]	CHANG F Z， MA T Y， WANG D T， et al. Method for building segmentation and extraction from high-resolution remote sensing images based on improved YOLOv5ds［J］. PLOS One， 2025， 20（3）： e0317106.
[4]	LIU Q H， KAMPFFMEYER M， JENSSEN R， et al. Multi-modal land cover mapping of remote sensing images using pyramid attention and gated fusion networks［J］. International Journal of Remote Sensing， 2022， 43（9）： 3509-3535.
[5]	GHAMISI P， YOKOYA N， LI J， et al. Advances in hyperspectral image and signal processing： A comprehensive overview of the state of the art［J］. IEEE Geoscience and Remote Sensing Magazine， 2017， 5（4）： 37-78.
[6]	LI A J， JIAO L C， ZHU H， et al. Multitask semantic boundary awareness network for remote sensing image segmentation［J］. IEEE Transactions on Geoscience and Remote Sensing， 2022， 60： 5400314.
[7]	LIU R， MI L， CHEN Z Z. AFNet： Adaptive fusion network for remote sensing image semantic segmentation［J］. IEEE Transactions on Geoscience and Remote Sensing， 2021， 59（9）： 7871-7886.
[8]	LI Y X， HOU Q B， ZHENG Z H， et al. Large selective kernel network for remote sensing object detection［C］∥2023 IEEE/CVF International Conference on Computer Vision （ICCV）. Piscataway： IEEE Press， 2024： 16748-16759.
[9]	WANG L B， LI R， ZHANG C， et al. UNetFormer： A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery［J］. ISPRS Journal of Photogrammetry and Remote Sensing， 2022， 190： 196-214.
[10]	WANG D， ZHANG Q M， XU Y F， et al. Advancing plain vision transformer toward remote sensing foundation model［J］. IEEE Transactions on Geoscience and Remote Sensing， 2023， 61： 5607315.
[11]	LI R， ZHENG S Y， ZHANG C， et al. ABCNet： Attentive bilateral contextual network for efficient semantic segmentation of Fine-Resolution remotely sensed imagery［J］. ISPRS Journal of Photogrammetry and Remote Sensing， 2021， 181： 84-98.
[12]	KERVADEC H， BOUCHTIBA J， DESROSIERS C， et al. Boundary loss for highly unbalanced segmentation［J］. Medical Image Analysis， 2021， 67： 101851.
[13]	ZHAO S， WANG Y， YANG Z， et al. Region mutual information loss for semantic segmentation［C］∥Proceedings of the 33rd International Conference on Neural Information Processing Systems. New York： ACM， 2019： 11117-11127.
[14]	LI X L， CHEN J S， ZHAO L L， et al. Adaptive distance-weighted voronoi tessellation for remote sensing image segmentation［J］. Remote Sensing， 2020， 12（24）： 4115.
[15]	BOKHOVKIN A， BURNAEV E. Boundary loss for remote sensing imagery semantic segmentation［C］∥Advances in Neural Networks-ISNN 2019. Cham： Springer， 2019： 388-401.
[16]	GÜL F， APTOULA E. A distance transform based loss function for the semantic segmentation of very high resolution remote sensing images［C］∥IGARSS 2024-2024 IEEE International Geoscience and Remote Sensing Symposium. Piscataway： IEEE Press， 2024： 9888-9891.
[17]	BERTASIUS G， SHI J B， TORRESANI L. Semantic segmentation with boundary neural fields［C］∥2016 IEEE Conference on Computer Vision and Pattern Recognition （CVPR）. Piscataway： IEEE Press， 2016： 3602-3610.
[18]	WEN G J， WANG R S. Automatic extraction of main roads from aerial remote sensing images［J］. Journal of Software， 2000， 11（7）： 957-964 （in Chinese）.
[19]	SHELHAMER E， LONG J， DARRELL T. Fully convolutional networks for semantic segmentation［J］. IEEE Transactions on Pattern Analysis and Machine Intelligence， 2017， 39（4）： 640-651.
[20]	RONNEBERGER O， FISCHER P， BROX T. U-Net： Convolutional networks for biomedical image segmentation［C］∥Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015. Cham： Springer， 2015： 234-241.
[21]	CHEN L C， ZHU Y K， PAPANDREOU G， et al. Encoder-decoder with atrous separable convolution for semantic image segmentation［C］∥Computer Vision-ECCV 2018. Cham： Springer， 2018： 833-851.
[22]	BADRINARAYANAN V， KENDALL A， CIPOLLA R. SegNet： A deep convolutional encoder-decoder architecture for image segmentation［J］. IEEE Transactions on Pattern Analysis and Machine Intelligence， 2017， 39（12）： 2481-2495.
[23]	SHEKHOVTSOV A， YANUSH V. Reintroducing straight-through estimators as principled methods for stochastic binary networks［C］∥Pattern Recognition （DAGM GCPR 2021）. Cham： Springer， 2021： 111-126.
[24]	WANG J J， ZHENG Z， MA A L， et al. LoveDA： A remote sensing land-cover dataset for domain adaptive semantic segmentation［DB/OL］. arXiv：， 2021.
[25]	LYU Y， VOSSELMAN G， XIA G S， et al. UAVid： A semantic segmentation dataset for UAV imagery［J］. ISPRS Journal of Photogrammetry and Remote Sensing， 2020， 165： 108-119.
[26]	ISPRS. Potsdam 2D semantic labeling dataset［DB/OL］. （2015-04-01）［2025-06-20］. .
[27]	HWANG G， JEONG J， LEE S J. SFA-Net： Semantic feature adjustment network for remote sensing image segmentation［J］. Remote Sensing， 2024， 16（17）： 3278.
[28]	WEI X Y， RAO L， FAN G Y， et al. MLFMNet： A multilevel feature mining network for semantic segmentation on aerial images［J］. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 2024， 17： 16165-16179.
[29]	KRIZHEVSKY A， SUTSKEVER I， HINTON G E. ImageNet classification with deep convolutional neural networks［J］. Communications of the ACM， 2017， 60（6）： 84-90.
[30]	CHENG B W， GIRSHICK R， DOLLÁR P， et al. Boundary IoU： Improving object-centric image segmentation evaluation［C］∥2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Piscataway： IEEE Press， 2021： 15329-15337.
[31]	MA A L， WANG J J， ZHONG Y F， et al. FactSeg： Foreground activation-driven small object semantic segmentation in large-scale remote sensing imagery［J］. IEEE Transactions on Geoscience and Remote Sensing， 2022， 60： 5606216.
[32]	WANG L B， LI R， DUAN C X， et al. A novel transformer based semantic segmentation scheme for fine-resolution remote sensing images［J］. IEEE Geoscience and Remote Sensing Letters， 2022， 19： 6506105.
[33]	CHEN Y X， FANG P C， ZHONG X L， et al. Hi-ResNet： Edge detail enhancement for high-resolution remote sensing segmentation［J］. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 2024， 17： 15024-15040.
[34]	LI F， ZHANG H， XU H Z， et al. Mask DINO： Towards a unified transformer-based framework for object detection and segmentation［C］∥2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition （CVPR）. Piscataway： IEEE Press， 2023： 3041-3050.
[35]	HÜMMER C， SCHWONBERG M， ZHOU L W， et al. Strong but simple： A baseline for domain generalized dense perception by CLIP-based transfer learning［C］∥Computer Vision-ACCV 2024. Singapore： Springer， 2025： 463-484.
[36]	MA X W， LIAN R R， WU Z K， et al. LOGCAN++： Adaptive local-global class-aware network for semantic segmentation of remote sensing images［J］. IEEE Transactions on Geoscience and Remote Sensing， 2025， 63： 4404216.
[37]	HANYU T， YAMAZAKI K， TRAN M， et al. AerialFormer： Multi-resolution transformer for aerial image segmentation［J］. Remote Sensing， 2024， 16（16）： 2930.
[38]	XIE E Z， WANG W H， YU Z D， et al. SegFormer： Simple and efficient design for semantic segmentation with transformers［DB/OL］. arXiv preprint： 2105.15203， 2021.
[39]	WANG L B， LI R， WANG D Z， et al. Transformer meets convolution： A bilateral awareness network for semantic segmentation of very fine resolution urban scene images［J］. Remote Sensing， 2021， 13（16）： 3065.
[40]	LU W， CHEN S B， SHU Q L， et al. DecoupleNet： A lightweight backbone network with efficient feature decoupling for remote sensing visual tasks［J］. IEEE Transactions on Geoscience and Remote Sensing， 2024， 62： 4414613.
[41]	SUN H， XIE Y C， REN D， et al. MMLN： Multi-directional and multi-constraint learning network for remote sensing imagery semantic segmentation［J］. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 2024： 1-16.
[42]	ALMARZOUQI H， SAAD SAOUD L. Semantic labeling of high-resolution images using Efficient UNets and transformers［J］. IEEE Transactions on Geoscience and Remote Sensing， 2023， 61： 4402913.
[43]	CHA K， SEO J， LEE T. A billion-scale foundation model for remote sensing images［J］. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 2024： 1-17.

Dataset	Training set	Validation set	Test set	Semantic mask classes
UAVid	1-15, 31-35	16-20, 36, 37	21-30, 38-42	8
Potsdam	2_11, 2_12, 3_10, 3_11, 3_12, 4_10, 4_12, 5_10, 5_11, 5_12, 6_7, 6_8, 6_9, 6_10, 6_11, 6_12, 7_7, 7_8, 7_9, 7_10, 7_11, 7_12	2_10	2_13, 2_14, 3_13, 3_14, 4_13, 4_14, 4_15, 5_13, 5_14, 5_15, 6_13, 6_14, 6_15, 7_13	6
LoveDA	0-2521	2522-4190	4191-5986	7

Type	Method	Source	mIoU/%	IoU per category/%
				Building	Road	Water	Barren	Agriculture	Background	Forest
Other state-of-the-art methods	HRNetw32 [24]	NeurIPS 2021	49.8	55.3	57.4	80.0	11.1	60.9	44.6	45.2
	FactSeg [31]	TGRS 2022	48.9	53.6	52.8	76.9	16.2	57.5	42.6	42.9
	DC-Swin [32]	GRSL 2022	50.6	54.5	56.2	78.1	14.5	62.4	41.3	47.2
	Hi-ResNet [33]	JSTARS 2023	52.5	58.3	55.9	80.1	17.0	62.7	46.7	46.7
	Mask DINO [34]	CVPR 2023	52.6	60.0	55.1	79.8	20.3	62.7	44.9	46.2
	VLTSeg [35]	ACCV 2024	53.8	57.9	61.3	80.5	24.1	60.2	45.8	46.5
	LOGCAN [36]	TGRS 2024	53.4	58.4	56.5	80.1	18.4	56.8	47.4	47.9
	AerialFormer-B [37]	RS 2024	54.1	60.7	59.3	81.5	17.9	64.0	47.8	47.9
Baseline methods	UNetFormer	ISPRS 2022	51.9	57.9	54.1	79.1	19.8	62.3	44.2	45.7
	MLFMNet-B [28]	JSTARS 2024	53.1	60.8	57.2	81.3	17.5	61.6	45.8	47.4
	SFA-Net	RS 2024	54.2	61.0	57.8	81.4	21.5	64.8	47.3	45.8
Lossd-enhanced methods	UNetFormer + Ours		53.4 (+1.5)	62.1	57.5	81.8	20.3	63.0	44.9	46.1
	MLFMNet-B + Ours		54.3 (+1.2)	65.5	60.5	82.8	20.0	62.8	46.3	47.7
	SFA-Net + Ours		55.8 (+1.6)	65.4	60.7	83.2	21.9	65.7	47.4	46.5
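
As a reading aid for the mIoU columns, the following minimal sketch shows how per-class IoU and mIoU are conventionally computed from a pixel-level confusion matrix. The function names and the toy labels are illustrative assumptions, not taken from the paper. The last lines only check that the unweighted mean of the seven per-class IoUs reported for SFA-Net + Ours agrees with the reported 55.8% mIoU.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm: List[List[int]]) -> List[float]:
    """IoU_c = TP / (TP + FP + FN) for every class c."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # ground truth c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, ground truth otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(ious: Sequence[float]) -> float:
    """Unweighted mean over classes, skipping classes that never occur (NaN)."""
    valid = [v for v in ious if v == v]
    return sum(valid) / len(valid)


# Toy example with three classes and six flattened pixels (hypothetical data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
ious = per_class_iou(cm)   # 1/3, 2/3 and 1/2
miou = mean_iou(ious)      # 0.5

# Per-class IoUs (in %) reported above for SFA-Net + Ours.
reported = [65.4, 60.7, 83.2, 21.9, 65.7, 47.4, 46.5]
reported_miou = round(mean_iou(reported), 1)   # 55.8
```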

Type	Method	Source	mIoU/%	IoU per category/%
				Building	Road	Tree	Vegetation	Moving car	Static car	Pedestrian	Background
Other state-of-the-art methods	SegFormer [38]	NeurIPS 2021	65.4	85.4	79.9	78.5	61.8	71.8	52.1	27.8	66.3
	BANet [39]	RS 2021	66.0	84.5	80.0	78.3	61.3	58.8	52.2	19.9	66.3
	DC-Swin [32]	GRSL 2022	68.8	88.5	82.7	80.0	64.6	74.1	59.3	30.7	70.3
	DecoupleNet D2 [40]	TGRS 2024	65.1	84.4	79.9	78.2	61.3	73.6	48.8	30.2	64.6
	MMLN [41]	JSTARS 2024	69.5	88.4	81.9	80.9	65.7	62.0	74.8	32.5	69.6
	Mask DINO [34]	CVPR 2023	67.9	87.3	81.5	80.2	63.7	73.6	56.2	31.0	68.6
	VLTSeg [35]	ACCV 2024	69.5	89.6	83.3	80.7	65.1	74.9	59.7	31.9	70.6
Baseline methods	UNetFormer	ISPRS 2022	67.4	87.2	81.1	79.8	63.1	73.3	55.9	30.6	68.2
	MLFMNet-B	JSTARS 2024	70.0	89.5	82.2	81.0	64.3	76.1	64.7	32.3	70.0
	SFA-Net	RS 2024	69.9	88.7	82.4	80.4	64.0	77.1	66.9	30.2	69.7
Lossd-enhanced methods	UNetFormer + Ours		67.8 (+0.4)	88.1	81.9	80.2	63.3	73.5	56.2	30.9	68.5
	MLFMNet-B + Ours		70.8 (+0.8)	91.4	83.5	81.4	64.9	76.4	65.6	32.6	70.3
	SFA-Net + Ours		70.5 (+0.6)	90.0	83.8	80.9	64.7	77.2	67.3	30.3	70.1

Type	Method	Source	mF1/%	F1 per category/%
				Impervious surface	Building	Low vegetation	Tree	Car
Other state-of-the-art methods	DC-Swin [32]	GRSL 2022	93.3	94.2	97.6	88.6	89.6	96.3
	EfficientUNets [42]	TGRS 2023	93.5	94.8	98.2	89.5	90.5	94.6
	Mask DINO [34]	CVPR 2023	93.2	94.1	96.9	89.5	88.7	96.8
	VLTSeg [35]	ACCV 2024	93.8	95.2	97.4	89.3	89.2	98.0
	ViT-G12X4 [43]	JSTARS 2024	92.1	92.8	96.9	85.9	89.0	96.0
	AerialFormer-B [37]	RS 2024	94.1	95.5	98.1	89.8	89.8	97.5
Baseline methods	UNetFormer	ISPRS 2022	92.3	93.1	96.7	87.4	88.5	96.0
	MLFMNet-B	JSTARS 2024	93.4	94.6	97.5	88.4	88.7	96.9
	SFA-Net	RS 2024	93.2	94.6	97.2	88.1	89.5	96.7
Lossd-enhanced methods	UNetFormer + Ours		93.1 (+0.8)	94.4	98.3	87.7	88.9	96.4
	MLFMNet-B + Ours		94.1 (+0.7)	96.1	98.9	89.2	89.3	97.2
	SFA-Net + Ours		94.0 (+0.8)	96.2	98.4	88.6	90.1	96.9
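
For the mF1 columns, a similar hedged sketch: per-class F1 is usually computed as 2TP / (2TP + FP + FN), and it relates to IoU through F1 = 2·IoU / (1 + IoU). The helper names and the toy confusion matrix are assumptions for illustration only. The final lines check that the mean of the five per-class F1 values reported for MLFMNet-B + Ours agrees with the reported 94.1% mF1.

```python
from typing import List


def per_class_f1(cm: List[List[int]]) -> List[float]:
    """F1_c = 2*TP / (2*TP + FP + FN); rows are ground truth, columns are predictions."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp
        fn = sum(cm[c]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else float("nan"))
    return scores


def iou_to_f1(iou: float) -> float:
    """Convert a per-class IoU in [0, 1] into the equivalent F1 score."""
    return 2 * iou / (1 + iou)


# Toy two-class confusion matrix (hypothetical pixel counts).
cm = [[8, 2],
      [1, 9]]
f1 = per_class_f1(cm)        # 16/19 and 18/21
mf1 = sum(f1) / len(f1)

# Class 0 has IoU = 8 / (8 + 1 + 2) = 8/11, which maps to the same F1 of 16/19.
f1_from_iou = iou_to_f1(8 / 11)

# Per-class F1 values (in %) reported above for MLFMNet-B + Ours.
reported = [96.1, 98.9, 89.2, 89.3, 97.2]
reported_mf1 = round(sum(reported) / len(reported), 1)   # 94.1
```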

Metric	N=0	N=50	N=80	N=90	N=100	N=120
mIoU/%	54.2	55.1	55.4	55.6	55.2	55.0
Change/%	0	+0.9	+1.2	+1.4	+1.0	+0.8

Metric	N=0	N=50	N=60	N=70	N=80	N=100
mIoU/%	53.1	53.5	53.6	54.0	53.7	53.6
Change/%	0	+0.4	+0.5	+0.9	+0.6	+0.5

Model (mIoU/%)	α=0	α=0.05	α=0.1	α=0.2	α=0.33	α=0.4	α=0.5
SFA-Net	54.2	55.7 (+0.4)	55.8 (+1.6)	55.6 (+1.4)	54.9 (+0.7)	54.3 (+0.1)	53.4 (-0.8)
MLFMNet	53.1	53.2 (+0.1)	53.5 (+0.4)	53.9 (+0.8)	54.3 (+1.2)	53.7 (+0.6)	53.2 (+0.1)
UNetFormer	51.9	52.2 (+0.2)	52.4 (+0.5)	53.0 (+1.2)	53.4 (+1.5)	52.2 (+0.3)	50.8 (-1.3)

Method	Training time multiplier	mIoU/%	IoU per category/%
			Building	Road	Water	Barren	Forest	Agriculture	Background
SFA-Net	1	54.2	61.0	57.8	81.4	21.5	45.8	64.8	47.3
SFA-Net + Manhattan distance	1.43	55.8 (+1.6)	65.4	60.7	83.2	21.9	46.5	65.7	47.4
SFA-Net + Euclidean distance	3.25	55.7 (+1.4)	63.6	59.8	84.0	21.9	47.1	65.3	47.5
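
To make the two distance variants in this comparison concrete, here is a minimal pure-Python sketch of a distance transform on a binary mask. It is an illustration under assumptions, not the authors' implementation: the Manhattan (city-block) map uses the classic two-pass raster scan, and the Euclidean map uses a brute-force search that is only practical for tiny grids. All function names and the toy mask are hypothetical.

```python
import math
from typing import List


def manhattan_distance_transform(mask: List[List[int]]) -> List[List[int]]:
    """City-block distance from every pixel to the nearest foreground (1) pixel."""
    h, w = len(mask), len(mask[0])
    inf = h + w  # exceeds any city-block distance inside an h x w grid
    dist = [[0 if mask[y][x] else inf for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def euclidean_distance_transform(mask: List[List[int]]) -> List[List[float]]:
    """Brute-force Euclidean distance to the nearest foreground pixel.

    Assumes the mask contains at least one foreground pixel.
    """
    h, w = len(mask), len(mask[0])
    fg = [(y, x) for y in range(h) for x in range(w) if mask[y][x]]
    return [[min(math.hypot(y - fy, x - fx) for fy, fx in fg)
             for x in range(w)] for y in range(h)]


# Toy mask with a two-pixel object in the middle row.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
d_manhattan = manhattan_distance_transform(mask)
d_euclidean = euclidean_distance_transform(mask)
```

On this mask the corner pixels are 2 steps away under the Manhattan metric and sqrt(2) away under the Euclidean metric; the integer two-pass scan needs only additions and comparisons, which is one plausible reason a Manhattan variant could be cheaper to train with.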

Method	mIoU/%
SFA-Net	54.2
SFA-Net + foreground distance	54.9 (+0.7)
SFA-Net + background distance	54.3 (+0.1)
SFA-Net + bidirectional distance	55.8 (+1.6)
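
This page does not give the formula for Lossd, so the following is only a hedged sketch of what a two-sided, distance-weighted penalty could look like for one binary class. It assumes that predicted foreground probability outside the ground-truth object is weighted by its distance to the object (penalizing over-segmentation), and that missing probability inside the object is weighted by its distance to the background (penalizing under-segmentation). The function names, the normalization, and the mapping of these two terms onto the "foreground" and "background" distances in the table are all assumptions.

```python
from typing import List


def manhattan_to_nearest(mask: List[List[int]], target: int) -> List[List[int]]:
    """Brute-force city-block distance to the nearest pixel equal to `target`."""
    h, w = len(mask), len(mask[0])
    pts = [(y, x) for y in range(h) for x in range(w) if mask[y][x] == target]
    if not pts:  # no such pixel: return zeros so the term contributes nothing
        return [[0] * w for _ in range(h)]
    return [[min(abs(y - py) + abs(x - px) for py, px in pts)
             for x in range(w)] for y in range(h)]


def bidirectional_distance_loss(prob: List[List[float]], gt: List[List[int]]) -> float:
    """Hypothetical two-sided distance penalty, averaged over all pixels."""
    h, w = len(gt), len(gt[0])
    d_to_fg = manhattan_to_nearest(gt, 1)  # positive only outside the object
    d_to_bg = manhattan_to_nearest(gt, 0)  # positive only inside the object
    over = sum(prob[y][x] * d_to_fg[y][x] for y in range(h) for x in range(w))
    under = sum((1.0 - prob[y][x]) * d_to_bg[y][x] for y in range(h) for x in range(w))
    return (over + under) / (h * w)


# One-row toy example: the object occupies the three middle pixels.
gt = [[0, 0, 1, 1, 1, 0, 0]]
perfect = [[0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0]]
near_leak = [[0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]]   # one false positive next to the object
far_leak = [[1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0]]    # one false positive two pixels away
hole = [[0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0]]        # object centre is missed

loss_perfect = bidirectional_distance_loss(perfect, gt)   # 0.0
loss_near = bidirectional_distance_loss(near_leak, gt)    # 1/7
loss_far = bidirectional_distance_loss(far_leak, gt)      # 2/7
loss_hole = bidirectional_distance_loss(hole, gt)         # 2/7
```

Unlike a per-pixel classification loss, which would score the near and far false positives identically, this kind of term grows with how far an error lies from the true boundary, which is the structural behaviour the abstract describes.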

Method	mBIoU/%	BIoU per category/%
		Building	Road	Water	Barren	Forest	Agriculture	Background
SFA-Net	23.5	25.7	21.6	54.4	6.9	14.5	30.4	11.2
SFA-Net + Lossd	30.9	42.6	29.2	64.1	9.7	19.1	34.6	16.4
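
For readers unfamiliar with the BIoU metric (Boundary IoU, reference [30]), the sketch below illustrates the idea on a toy example: IoU is computed only over the pixels of each mask that lie within a small distance d of that mask's contour. This is a simplified illustration under assumptions, not the reference implementation: it uses a brute-force Manhattan band, does not treat the image border as a boundary, and the helper names and the value d = 1 are hypothetical.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def boundary_band(mask: List[List[int]], d: int) -> Set[Pixel]:
    """Mask pixels whose city-block distance to some non-mask pixel is at most d."""
    h, w = len(mask), len(mask[0])
    background = [(y, x) for y in range(h) for x in range(w) if not mask[y][x]]
    band = set()
    for y in range(h):
        for x in range(w):
            if mask[y][x] and any(abs(y - by) + abs(x - bx) <= d
                                  for by, bx in background):
                band.add((y, x))
    return band


def boundary_iou(gt: List[List[int]], pred: List[List[int]], d: int = 1) -> float:
    """IoU restricted to the inner boundary bands of the two masks."""
    g, p = boundary_band(gt, d), boundary_band(pred, d)
    union = g | p
    return len(g & p) / len(union) if union else 1.0


def square(h: int, w: int, top: int, left: int, size: int) -> List[List[int]]:
    """Binary h x w mask containing one filled size x size square."""
    return [[1 if top <= y < top + size and left <= x < left + size else 0
             for x in range(w)] for y in range(h)]


gt = square(5, 6, top=1, left=1, size=3)
pred = square(5, 6, top=1, left=2, size=3)   # same square shifted right by one pixel

biou = boundary_iou(gt, pred, d=1)           # 4 shared band pixels / 12 in the union
```

The two squares overlap in 6 of 12 pixels, so their ordinary IoU is 0.5, while the boundary-restricted score is 1/3: a one-pixel shift is punished more heavily, which is why a boundary-focused loss can move mBIoU much more than mIoU.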

Resolution	SFA-Net mBIoU/%	SFA-Net + Lossd mBIoU/%
224×224	19.7	23.5 (+3.8)
384×384	21.1	25.6 (+4.5)
512×512	23.5	30.9 (+7.4)
