special column

RS-AdaDiff: One-step remote sensing image super-resolution diffusion model with degradation-aware adaptive estimation

  • Fei WANG
  • Yong LIU
  • Jiawei YAO
  • Xuanlei ZHU
  • Xiaoqiang LU
  • Wenxing GUO
  • Xuetao ZHANG
  • Yu GUO
  • 1. National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi’an Jiaotong University, Xi’an 710049, China
    2. National Engineering Research Center of Visual Information and Applications, Xi’an Jiaotong University, Xi’an 710049, China
    3. Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an 710049, China
    4. College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
E-mail: yu.guo@xjtu.edu.cn

Received date: 2025-09-06

Revised date: 2025-09-24

Accepted date: 2025-10-20

Online published: 2025-11-13

Supported by

National Major Science and Technology Projects of China (2009XJTU0016)

Abstract

Diffusion models have demonstrated great potential in generating realistic image details. However, existing diffusion models are trained primarily on natural images, which makes them difficult to apply directly to remote sensing image super-resolution. Moreover, these models typically require dozens or even hundreds of iterative sampling steps during inference, resulting in high computational cost and limited practicality. To address these issues, this paper proposes RS-AdaDiff, a one-step diffusion model for remote sensing image super-resolution based on degradation-aware adaptive estimation, which balances reconstruction performance and inference efficiency. Specifically, we propose a degradation-aware timestep estimation module that adaptively estimates the diffusion timestep by assessing the degradation level of the input image. This reformulates the iterative denoising process as a single-step reconstruction from the low-resolution image to the high-resolution image, thereby significantly accelerating inference. Meanwhile, we integrate trainable lightweight LoRA layers into a pre-trained diffusion model and fine-tune it on a remote sensing image dataset to mitigate the domain gap caused by differences in data distribution. Additionally, to fully leverage the image priors of the pre-trained model, we introduce distribution contrastive matching distillation: by regularizing the KL divergence, the reconstructed super-resolved images are pulled closer to high-resolution images and pushed farther from low-resolution images in the feature space, thereby improving generation quality. Finally, we propose a feature-edge joint perceptual similarity loss to enhance the perception of structural information and mitigate issues such as edge blur and texture distortion. Extensive experiments demonstrate that the proposed RS-AdaDiff outperforms existing state-of-the-art methods on multiple public remote sensing datasets, achieving significant improvements in both quantitative metrics and visual quality and producing super-resolved remote sensing images with clearer structures and richer details.
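
As a rough illustration of the one-step idea described in the abstract, the sketch below maps a degradation score to a diffusion timestep and then recovers a clean estimate in a single step using the standard DDPM relation x0 = (x_t - sqrt(1 - abar_t) * eps) / sqrt(abar_t). It is not the authors' implementation. The linear beta schedule, the scalar degradation score in [0, 1], the linear score-to-timestep mapping, and all function names are assumptions made for illustration; in RS-AdaDiff the timestep is estimated by a learned module and the noise is predicted by a LoRA-tuned network.

```python
import math


def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear DDPM beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def timestep_from_degradation(score, num_steps=1000):
    """Hypothetical mapping: a degradation score in [0, 1] picks a timestep index.

    A heavier degradation selects a larger timestep. The paper learns this
    estimate; the linear rule here is only a stand-in.
    """
    score = min(max(score, 0.0), 1.0)  # clamp to the valid range
    return round(score * (num_steps - 1))


def one_step_x0(x_t, eps_pred, t, alpha_bar):
    """Standard DDPM estimate of the clean sample from x_t and predicted noise."""
    a = alpha_bar[t]
    return [(x - math.sqrt(1.0 - a) * e) / math.sqrt(a)
            for x, e in zip(x_t, eps_pred)]


# Usage: noise a toy signal at the chosen timestep, then invert it in one step.
alpha_bar = linear_alpha_bar()
x0 = [0.5, -0.25, 1.0]    # toy "clean" values
eps = [0.1, -0.3, 0.2]    # noise, assumed to be predicted perfectly here
t = timestep_from_degradation(0.4)
a = alpha_bar[t]
x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
recovered = one_step_x0(x_t, eps, t, alpha_bar)
```

With a perfect noise prediction the single step inverts the forward process exactly (up to floating-point error). In practice the quality of the one-step result depends on how well the network predicts the noise at the estimated timestep, which is why choosing that timestep adaptively matters.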

Cite this article

Fei WANG, Yong LIU, Jiawei YAO, Xuanlei ZHU, Xiaoqiang LU, Wenxing GUO, Xuetao ZHANG, Yu GUO. RS-AdaDiff: One-step remote sensing image super-resolution diffusion model with degradation-aware adaptive estimation[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2025, 46(23): 632763-632763. DOI: 10.7527/S1000-6893.2025.32763
