Special Column

Dynamic brightness reconstruction for UAV visible-infrared fusion object detection

  • Kui LIU,
  • Hao SUN,
  • Han WU,
  • Kefeng JI,
  • Gangyao KUANG
  • 1. College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China
    2. State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, National University of Defense Technology, Changsha 410073, China

Received date: 2025-03-12

Revised date: 2025-03-29

Accepted date: 2025-05-28

Online published: 2025-06-06

Supported by

National Natural Science Foundation of China(61971426)

Abstract

Visible-infrared fusion object detection for Unmanned Aerial Vehicles (UAVs) has important application value in military and civilian fields such as disaster rescue, security monitoring, and battlefield reconnaissance. However, under low-illumination conditions, existing fusion strategies have several limitations: they ignore uneven lighting across different areas of the same scene and over-rely on the infrared modality, leaving the potentially rich semantic information of visible images under low illumination underexploited. In addition, low light further exacerbates the difficulty of cross-modal fusion. To address these problems, a dynamic brightness reconstruction method for UAV visible-infrared fusion object detection is proposed. Firstly, a Superpixel Dynamic Illumination-aware Mask (SDIM) module is designed using prior local illumination information. By simulating the dependence of real scenes on different modalities and introducing superpixel information, it resolves the loss of object edge features in existing methods. Secondly, considering feature degradation in low-light visible images, a Low-Illumination Image Enhancement (LIIE) module is designed to achieve end-to-end, detection-oriented adaptive enhancement of the key semantics of visible images. Finally, a Multi-Scale Feature Cross-attention Fusion (MFCF) module is designed to address the fusion conflicts caused by cross-modal feature heterogeneity. The module constructs a bimodal feature interaction space through a hierarchical cross-attention mechanism and adaptively fuses multi-scale features using a dynamic weight allocation strategy. Experimental results on the typical visible-infrared fusion object detection datasets DroneVehicle and VEDAI verify the effectiveness and robustness of the proposed method in visible-infrared fusion object detection tasks under low-illumination conditions.
Specifically, compared with existing advanced fusion detection algorithms, the proposed method improves the mean Average Precision (mAP) by 2.3% and 2.2%, respectively, while maintaining a low parameter count; compared with the widely used single-modality YOLOv8 algorithm, mAP increases by up to 12.9%. In addition, cross-scene experimental results on the LIS real low-illumination dataset further validate the strong generalization capability of the proposed method.
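The abstract's core idea of weighting the two modalities by local illumination (as in the SDIM module) can be sketched as a simplified, pixel-level soft gate. The paper operates on superpixels and learns its masks end-to-end, so the function below, including its name, sigmoid gate, and threshold values, is an illustrative assumption rather than the authors' method:

```python
import numpy as np

def illumination_gate(rgb_feat, ir_feat, luminance, k=10.0, thresh=0.35):
    """Blend per-pixel RGB and IR feature maps with a soft illumination gate.

    rgb_feat, ir_feat: (H, W, C) feature maps from the two branches.
    luminance: (H, W) mean brightness of the visible image in [0, 1].
    A sigmoid of (luminance - thresh) gives the RGB weight: well-lit
    regions lean on the visible branch, dark regions on the infrared one.
    """
    w_rgb = 1.0 / (1.0 + np.exp(-k * (luminance - thresh)))  # (H, W) in (0, 1)
    w_rgb = w_rgb[..., None]                                  # broadcast over channels
    return w_rgb * rgb_feat + (1.0 - w_rgb) * ir_feat

# Toy example: a 4x4 scene, left half dark, right half bright.
H, W, C = 4, 4, 8
lum = np.zeros((H, W)); lum[:, W // 2:] = 1.0
rgb = np.ones((H, W, C)); ir = np.zeros((H, W, C))
fused = illumination_gate(rgb, ir, lum)
# Dark columns end up dominated by IR features, bright columns by RGB features.
```

A per-region (rather than global) gate like this is what lets a single scene mix modality weights, which is the limitation of scene-level illumination weighting that the abstract criticizes.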
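The hierarchical cross-attention underlying the MFCF module builds on standard cross-attention, in which tokens of one modality attend to tokens of the other. A minimal single-head NumPy sketch of that exchange follows; the actual module is multi-scale, learned, and uses dynamic weight allocation, so every name and shape here is an illustrative assumption:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feat, kv_feat):
    """Single-head cross-attention: tokens of one modality attend to the other.

    query_feat: (N, D) tokens from modality A (e.g. visible).
    kv_feat:    (M, D) tokens from modality B (e.g. infrared).
    Returns (N, D): modality-A tokens re-expressed as attention-weighted
    sums of modality-B tokens, the basic cross-modal exchange.
    """
    d = query_feat.shape[-1]
    attn = softmax(query_feat @ kv_feat.T / np.sqrt(d))  # (N, M) rows sum to 1
    return attn @ kv_feat

rng = np.random.default_rng(0)
vis = rng.standard_normal((16, 32))   # 16 visible-branch tokens
ir = rng.standard_normal((16, 32))    # 16 infrared-branch tokens
vis_attends_ir = cross_attention(vis, ir)
ir_attends_vis = cross_attention(ir, vis)
fused = vis_attends_ir + ir_attends_vis  # symmetric bidirectional exchange
```

Running this bidirectionally, as in the last lines, gives each modality a representation conditioned on the other, which is the "bimodal feature interaction space" the abstract describes at a single scale.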

Cite this article

Kui LIU, Hao SUN, Han WU, Kefeng JI, Gangyao KUANG. Dynamic brightness reconstruction for UAV visible-infrared fusion object detection[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2025, 46(23): 631968-631968. DOI: 10.7527/S1000-6893.2025.31968
