Reinforcement learning-driven object detection method for degraded remote sensing images

doi:10.7527/S1000-6893.2025.32861

Special Issue on Intelligent Processing and Analysis of Aerospace Remote Sensing Images

Wenlin LIU, Xikun HU, Ping ZHONG

College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China

Received: 2025-10-09 Revised: 2025-10-24 Accepted: 2025-11-25 Online: 2025-12-17 Published: 2025-11-28
Contact: Xikun HU E-mail: xikun@nudt.edu.cn
Supported by: National Natural Science Foundation of China (62301574); Science and Technology Innovation Program of Hunan Province (2024RC3119)

Abstract

Object detection in satellite remote sensing images is a key technique for Earth observation and intelligent interpretation. However, most existing research assumes ideal imaging conditions, and detection performance remains insufficient under complex degradations such as adverse weather, atmospheric turbulence, and noise interference. To address this limitation, a reinforcement learning-driven object detection method for degraded remote sensing images is proposed, which achieves robust detection in complex scenarios by dynamically orchestrating image preprocessing operators. The core idea is to take object detection performance as the optimization objective and to use the decision-making capability of reinforcement learning to adaptively and iteratively select and combine preprocessing operations, including denoising, deblurring, and contrast enhancement, thereby improving both image quality and detection accuracy. Experiments with the YOLO11m-OBB detector on degraded scenarios constructed from the DIOR and DOTA satellite remote sensing datasets show strong performance on both datasets. On DIOR, mAP₅₀ improves by 11.1 and 2.5 percentage points over the Raw-Syn (trained on original data, validated on degraded data) and Syn-Syn (trained and validated on degraded data) schemes, respectively, reaching 80.8%. On DOTA, mAP₅₀ improves by 7.2 and 2.8 percentage points over the same schemes, reaching 76.6%. The quality of the processed imagery also improves markedly (PSNR > 25 dB), supporting the effectiveness and applicability of the proposed method in complex environments.

Key words: reinforcement learning, satellite remote sensing, object detection, degraded images, image preprocessing, adaptive methods, robustness
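The iterative select-and-apply loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three toy operators, the greedy one-step search that stands in for a trained reinforcement learning policy, and the PSNR reward computed against a clean reference (the abstract states that the actual optimization objective is detection performance, which needs a detector and is omitted here).

```python
import math
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]  # grayscale image, intensities in [0, 1]


def psnr(ref: Image, img: Image) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, 1]."""
    n = len(ref) * len(ref[0])
    mse = sum((a - b) ** 2 for r, s in zip(ref, img) for a, b in zip(r, s)) / n
    return float("inf") if mse == 0 else 10.0 * math.log10(1.0 / mse)


def gamma_brighten(img: Image) -> Image:
    """Gamma correction with gamma = 0.5, which brightens dark pixels."""
    return [[p ** 0.5 for p in row] for row in img]


def contrast_stretch(img: Image) -> Image:
    """Linearly rescale intensities to span the full [0, 1] range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [row[:] for row in img]
    return [[(p - lo) / (hi - lo) for p in row] for row in img]


def mean_filter(img: Image) -> Image:
    """3x3 mean filter with edge replication."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx]
            row.append(acc / 9.0)
        out.append(row)
    return out


# Hypothetical toy toolbox; the paper's operator set is larger.
OPERATORS: Dict[str, Callable[[Image], Image]] = {
    "gamma_brighten": gamma_brighten,
    "contrast_stretch": contrast_stretch,
    "mean_filter": mean_filter,
}


def orchestrate(img: Image, reward: Callable[[Image], float],
                max_steps: int = 5) -> Tuple[Image, List[str]]:
    """Greedy stand-in for a learned policy: at each step apply the operator
    with the highest reward, and stop when no operator improves on it."""
    chosen: List[str] = []
    best = reward(img)
    for _ in range(max_steps):
        candidates = {name: op(img) for name, op in OPERATORS.items()}
        scores = {name: reward(c) for name, c in candidates.items()}
        name = max(scores, key=scores.get)
        if scores[name] <= best:
            break  # plays the role of a "stop" action
        img, best = candidates[name], scores[name]
        chosen.append(name)
    return img, chosen


clean = [[(x + y) / 6.0 for x in range(4)] for y in range(4)]
dark = [[0.25 * p for p in row] for row in clean]  # synthetic low-light input
restored, steps = orchestrate(dark, lambda im: psnr(clean, im))
print(steps)
```

A learned policy would replace the exhaustive one-step search with a single forward pass that maps the current image to an operator choice, so that no clean reference is needed at inference time.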

CLC number: V474.2

Wenlin LIU, Xikun HU, Ping ZHONG. Reinforcement learning-driven object detection method for degraded remote sensing images[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(10): 532861.


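Table 2. Degraded image generation methods and parameters

Degradation type	Generation method
Low light	Reduce image brightness to 30% to 50% of the original and add Gaussian noise with zero mean and variance 0.01 to 0.03
Fog	Based on the atmospheric scattering model, set the fog density parameter to a random value between 0.001 and 0.005 to generate uniform fog
Motion blur	Convolve the image with a randomly generated motion-trajectory kernel with direction $\theta \in [0^\circ, 90^\circ]$ and length 5 to 20
Defocus blur	Apply a disk-shaped blur kernel with radius $r \in [3, 8]$ to simulate optical lens defocus
Frosted-glass blur	Randomly shift pixels within local windows of size $3 \times 3$ to $7 \times 7$, with offsets of at most 5 pixels
Gaussian noise	Add Gaussian noise with zero mean and variance 0.01 to 0.05
JPEG compression	Apply standard JPEG compression with quality factor $Q \in [5, 30]$
ISO noise	Simulate photoelectron statistics by combining Poisson noise and Gaussian noise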
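A few of the degradations listed above can be sketched in plain Python. This is an illustrative sketch, not the authors' generation code: it assumes grayscale images stored as nested lists with intensities in [0, 1], it interprets the stated variances relative to that range, and the function names are hypothetical. Only the parameter ranges come from the table.

```python
import math
import random
from typing import List

Image = List[List[float]]  # grayscale image, intensities in [0, 1]


def _clip(v: float) -> float:
    return min(1.0, max(0.0, v))


def add_gaussian_noise(img: Image, variance: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise of the given variance, then clip to [0, 1]."""
    sigma = math.sqrt(variance)
    return [[_clip(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def low_light(img: Image, rng: random.Random) -> Image:
    """Scale brightness to 30%-50% and add noise with variance 0.01-0.03."""
    scale = rng.uniform(0.3, 0.5)
    dark = [[p * scale for p in row] for row in img]
    return add_gaussian_noise(dark, rng.uniform(0.01, 0.03), rng)


def disk_kernel(radius: int) -> List[List[float]]:
    """Normalized disk-shaped kernel of size (2r+1) x (2r+1) for defocus blur."""
    size = 2 * radius + 1
    mask = [[1.0 if (x - radius) ** 2 + (y - radius) ** 2 <= radius ** 2 else 0.0
             for x in range(size)] for y in range(size)]
    total = sum(map(sum, mask))
    return [[v / total for v in row] for row in mask]


def motion_kernel(length: int, theta_deg: float) -> List[List[float]]:
    """Normalized line kernel of the given length and direction for motion blur."""
    size = length if length % 2 == 1 else length + 1
    c = size // 2
    k = [[0.0] * size for _ in range(size)]
    theta = math.radians(theta_deg)
    for i in range(length):
        t = i - (length - 1) / 2.0
        x = round(c + t * math.cos(theta))
        y = round(c - t * math.sin(theta))  # image rows grow downward
        k[y][x] = 1.0
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


rng = random.Random(0)                      # fixed seed for reproducibility
image = [[0.8] * 16 for _ in range(16)]     # flat bright test image
dark = low_light(image, rng)
defocus = disk_kernel(rng.randint(3, 8))    # r in [3, 8]
motion = motion_kernel(rng.randint(5, 20), rng.uniform(0.0, 90.0))
```

The kernels would then be convolved with the image; that step is left out to keep the sketch short.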




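Degradation type	Traditional methods	Neural network methods
Low light	Brightness adjustment; contrast adjustment	Low-light enhancement network
Fog	Histogram equalization; dark channel dehazing	Dehazing network
Motion blur	Wiener filtering	Motion deblurring network
Defocus blur	-	Defocus deblurring network
Frosted-glass blur	-	Frosted-glass deblurring network
Gaussian noise	Mean filtering; Gaussian filtering	Denoising network
JPEG compression	-	JPEG compression reconstruction network
ISO noise	-	ISO denoising network
Other degradations	Sobel operator; bilateral filtering; median filtering; gamma adjustment; Laplacian sharpening	-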
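One way to expose this operator toolbox to a decision-making agent is as a flat discrete action space. The sketch below is an assumption about representation only: the identifier names, the flattening order, and the explicit `stop` action are hypothetical and are not stated in the source.

```python
from typing import Dict, List

# Operators grouped as in the table; names are illustrative identifiers.
TOOLBOX: Dict[str, Dict[str, List[str]]] = {
    "low_light": {"traditional": ["brightness_adjust", "contrast_adjust"],
                  "network": ["low_light_enhance_net"]},
    "fog": {"traditional": ["hist_equalize", "dark_channel_dehaze"],
            "network": ["dehaze_net"]},
    "motion_blur": {"traditional": ["wiener_filter"],
                    "network": ["motion_deblur_net"]},
    "defocus_blur": {"traditional": [], "network": ["defocus_deblur_net"]},
    "glass_blur": {"traditional": [], "network": ["glass_deblur_net"]},
    "gaussian_noise": {"traditional": ["mean_filter", "gaussian_filter"],
                       "network": ["denoise_net"]},
    "jpeg": {"traditional": [], "network": ["jpeg_reconstruct_net"]},
    "iso_noise": {"traditional": [], "network": ["iso_denoise_net"]},
    "other": {"traditional": ["sobel", "bilateral_filter", "median_filter",
                              "gamma_adjust", "laplacian_sharpen"],
              "network": []},
}


def build_actions(toolbox: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Flatten the toolbox into an ordered action list and append 'stop'."""
    actions: List[str] = []
    for ops in toolbox.values():
        actions.extend(ops["traditional"])
        actions.extend(ops["network"])
    actions.append("stop")  # assumed termination action
    return actions


ACTIONS = build_actions(TOOLBOX)
ACTION_INDEX = {name: i for i, name in enumerate(ACTIONS)}
print(len(ACTIONS))  # 12 traditional + 8 network operators + stop
```

Grouping by degradation type is kept only for readability; a policy operating on this list is free to pick any operator for any input, which is what allows it to handle unknown or compound degradations.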

Scheme	PSNR (DIOR)/dB	PSNR (DOTA)/dB
No processing (baseline)	19.63	19.21
Restore-Detect	27.64	29.91
RL-adapt	25.77	25.79
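For reference, PSNR is conventionally defined from the mean squared error between a reference image $x$ and a processed image $\hat{x}$. The source does not state the peak value or the reference image used; the form below assumes $N$ pixel values per image, a peak intensity $\mathrm{MAX}$ (255 for 8-bit data), and the undegraded image as reference.

```latex
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2,
\qquad
\mathrm{PSNR} = 10\,\log_{10}\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\ \text{dB}
```

Under this definition, the DIOR gain of RL-adapt over the unprocessed baseline, $25.77 - 19.63 = 6.14$ dB, corresponds roughly to the MSE shrinking by a factor of $10^{0.614} \approx 4.1$.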

Detector	Scheme	mAP₅₀ (DIOR)/%	mAP₅₀ (DOTA)/%
YOLOv5-m	Raw-Raw	81.0	73.2
	Raw-Syn	71.2	65.4
	Syn-Syn	74.9	68.7
	Restore-Detect	75.2	69.1
	E2E	75.0	68.1
	RL-adapt	78.6	72.0
YOLO11m-OBB	Raw-Raw	81.8	78.3
	Raw-Syn	69.7	69.4
	Syn-Syn	78.3	73.8
	Restore-Detect	78.5	73.1
	E2E	78.0	72.2
	RL-adapt	80.8	76.6
Oriented R-CNN	Raw-Raw	71.1	80.1
	Raw-Syn	50.9	71.2
	Syn-Syn	61.2	73.3
	Restore-Detect	63.4	72.9
	E2E	64.0	73.5
	RL-adapt	69.0	76.4
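The gains quoted in the abstract are differences in mAP₅₀ measured in percentage points, and they can be recomputed from this table. The short script below does so; the dictionary layout and the `gain` helper are illustrative, while the numbers are copied from the rows above.

```python
# mAP50 (%) for the degraded-data schemes, copied from the table above.
MAP50 = {
    ("YOLOv5-m", "DIOR"): {"Raw-Syn": 71.2, "Syn-Syn": 74.9, "RL-adapt": 78.6},
    ("YOLOv5-m", "DOTA"): {"Raw-Syn": 65.4, "Syn-Syn": 68.7, "RL-adapt": 72.0},
    ("YOLO11m-OBB", "DIOR"): {"Raw-Syn": 69.7, "Syn-Syn": 78.3, "RL-adapt": 80.8},
    ("YOLO11m-OBB", "DOTA"): {"Raw-Syn": 69.4, "Syn-Syn": 73.8, "RL-adapt": 76.6},
    ("Oriented R-CNN", "DIOR"): {"Raw-Syn": 50.9, "Syn-Syn": 61.2, "RL-adapt": 69.0},
    ("Oriented R-CNN", "DOTA"): {"Raw-Syn": 71.2, "Syn-Syn": 73.3, "RL-adapt": 76.4},
}


def gain(detector: str, dataset: str, baseline: str) -> float:
    """Gain of RL-adapt over a baseline scheme, in percentage points."""
    row = MAP50[(detector, dataset)]
    return round(row["RL-adapt"] - row[baseline], 1)


for (detector, dataset) in MAP50:
    print(detector, dataset,
          gain(detector, dataset, "Raw-Syn"),
          gain(detector, dataset, "Syn-Syn"))
```

For YOLO11m-OBB this reproduces the abstract's figures: 11.1 and 2.5 points on DIOR, and 7.2 and 2.8 points on DOTA.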

Transfer scenario	Method	mAP₅₀/%	PSNR/dB
DIOR→DOTA	Baseline	69.4	19.21
	Zero-shot	74.3	23.18
	Few-shot	75.9	24.51
	Full-data	76.6	25.79
DOTA→DIOR	Baseline	71.2	19.13
	Zero-shot	76.1	23.08
	Few-shot	77.8	24.44
	Full-data	78.6	25.77

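Transfer setting	Method	mAP₅₀/%	PSNR/dB
Known noise → rain	Baseline	65.9	19.52
Known noise → rain	RL-adapt	72.5	21.34
Known noise → stripes	Baseline	69.3	20.85
Known noise → stripes	RL-adapt	73.4	22.53
Known noise → rain + stripes	Baseline	59.7	18.17
Known noise → rain + stripes	RL-adapt	70.2	20.78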


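Compound degradation type	Combined degradation mechanism
Fog + Gaussian noise	Atmospheric scattering + sensor thermal noise
Motion blur + JPEG compression	Satellite attitude jitter + data transmission loss
Low light + ISO noise	Low-light imaging + high-sensitivity (ISO) noise
Fog + motion blur + Gaussian noise	Atmospheric scattering + satellite attitude jitter + sensor noise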

Degradation type	Scheme	mAP₅₀/%	PSNR/dB
Fog + Gaussian noise	Raw-Syn	62.4	16.19
	Syn-Syn	65.3	-
	R-D	67.6	25.27
	E2E	66.9	-
	RL-adapt	68.4	24.51
Motion blur + JPEG compression	Raw-Syn	61.1	15.13
	Syn-Syn	64.2	-
	R-D	65.4	25.42
	E2E	67.0	-
	RL-adapt	69.0	24.44
Low light + ISO noise	Raw-Syn	63.9	16.12
	Syn-Syn	65.7	-
	R-D	65.4	23.1
	E2E	65.3	-
	RL-adapt	67.9	20.1
Fog + motion blur + Gaussian noise	Raw-Syn	60.1	15.91
	Syn-Syn	65.2	-
	R-D	65.4	22.9
	E2E	65.3	-
	RL-adapt	67.3	19.8

Maximum number of iterations	mAP₅₀/%	PSNR/dB	Inference time/ms
2	75.2	23.31	13.1
3	77.1	24.29	14.4
4	78.0	25.04	15.3
5	78.6	25.77	16.7
6	78.7	25.69	18.3
7	78.4	25.91	20.6
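The table shows accuracy saturating around five or six iterations while inference time keeps growing. One possible way to pick the iteration cap from such an ablation is sketched below: keep the settings whose mAP₅₀ is within a tolerance of the best, then take the fastest. The selection rule and the 0.1-point tolerance are assumptions for illustration, not the authors' stated criterion.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    cap: int        # maximum number of iterations
    map50: float    # mAP50 in %
    psnr: float     # PSNR in dB
    ms: float       # inference time in ms


# Values copied from the table above.
ROWS: List[Row] = [
    Row(2, 75.2, 23.31, 13.1),
    Row(3, 77.1, 24.29, 14.4),
    Row(4, 78.0, 25.04, 15.3),
    Row(5, 78.6, 25.77, 16.7),
    Row(6, 78.7, 25.69, 18.3),
    Row(7, 78.4, 25.91, 20.6),
]


def pick_cap(rows: List[Row], tol: float = 0.1) -> Row:
    """Fastest setting whose mAP50 is within `tol` points of the best."""
    best = max(r.map50 for r in rows)
    # Small epsilon guards against floating-point error in the subtraction.
    near_best = [r for r in rows if best - r.map50 <= tol + 1e-9]
    return min(near_best, key=lambda r: r.ms)


choice = pick_cap(ROWS)
print(choice.cap, choice.map50, choice.ms)
```

With this rule the cap of 5 is selected, whose mAP₅₀ and PSNR (78.6% and 25.77 dB) match the RL-adapt figures reported in the other tables.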

Experimental group	mAP₅₀/%	PSNR/dB	Training time/h
Group 1	77.2	24.94	25.2
Group 2	78.6	25.77	29.4
Group 3	78.1	25.17	35.1

Termination strategy	mAP₅₀/%	PSNR/dB
Dynamic mix of PSNR and mAP (proposed strategy)	78.6	25.77
PSNR only	76.3	25.89
mAP only	77.2	24.32

Restoration + detection algorithm	mAP₅₀/%	PSNR/dB
Original degraded data	69.4	19.63
HYPIR+YOLO11	75.8	24.33
Converse_DnCNN+YOLO11	78.1	26.37
RL-adapt (UNet+YOLO12)	78.5	25.77
RL-adapt (UNet+YOLO11)	78.6	25.77

Scheme	Training time/h	Inference time/ms
YOLO11m-OBB	4.8	12
UNet+YOLO11m-OBB	10.2	18
RL-adapt	70.3	25
HYPIR+YOLO11m-OBB	-	1221
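The inference times can be converted into throughput and relative overhead to make the cost of each pipeline easier to compare. The sketch assumes that each time is the latency for a single image and that images are processed one at a time; the source does not specify batch size or hardware.

```python
# Inference latency in ms, copied from the table above.
LATENCY_MS = {
    "YOLO11m-OBB": 12,
    "UNet+YOLO11m-OBB": 18,
    "RL-adapt": 25,
    "HYPIR+YOLO11m-OBB": 1221,
}


def throughput(ms: float) -> float:
    """Images per second, assuming sequential single-image inference."""
    return 1000.0 / ms


def overhead(scheme: str, reference: str = "YOLO11m-OBB") -> float:
    """Latency of `scheme` as a multiple of the reference latency."""
    return LATENCY_MS[scheme] / LATENCY_MS[reference]


for name, ms in LATENCY_MS.items():
    print(f"{name}: {throughput(ms):.1f} img/s, {overhead(name):.2f}x detector-only")
```

Under these assumptions RL-adapt runs at about 40 images per second, roughly twice the latency of the bare detector, while the HYPIR pipeline is about 49 times slower than RL-adapt.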
