Dynamic brightness reconstruction for UAV visible-infrared fusion object detection: "UAV Multi-Source Perception in Interference Environments" Special Column

  • LIU Kui,
  • SUN Hao,
  • WU Han,
  • JI Ke-Feng,
  • KUANG Gang-Yao
  • 1. State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, College of Electronic Science, National University of Defense Technology
    2. National University of Defense Technology

Received date: 2025-03-12

Revised date: 2025-05-29

Online published: 2025-06-06

Funding

National Natural Science Foundation of China


Abstract

UAV visible-infrared fusion object detection has important application value in military and civilian fields such as disaster rescue, security surveillance, and battlefield reconnaissance. Under low-illumination conditions, however, existing fusion strategies have notable shortcomings: they ignore the uneven illumination across different regions of the same scene and rely excessively on the infrared modality, so the rich semantic information still latent in low-light visible images is left under-exploited; low illumination also further aggravates the difficulty of cross-modal fusion. To address these problems, a UAV visible-infrared fusion object detection method based on dynamic brightness reconstruction is proposed. First, exploiting a local illumination prior, a Super-Pixel Dynamic Illumination-Aware Mask (SDIM) module is designed; by simulating how real scenes depend on the different modalities and introducing superpixel information, it resolves the loss of object edge features found in existing methods. Second, to counter feature degradation in low-light visible images, a Low Illumination Image Enhancement (LIIE) module is designed, realizing detection-oriented, end-to-end adaptive enhancement of the key semantics of visible images. Finally, to resolve the fusion conflicts caused by cross-modal feature heterogeneity, a Multi-Scale Feature Cross-Attention Fusion (MFCF) module is designed: a hierarchical cross-attention mechanism constructs a bimodal feature interaction space and, combined with a dynamic weight allocation strategy, achieves adaptive fusion of multi-scale features. Experimental results on the typical visible-infrared fusion object detection datasets DroneVehicle and VEDAI verify the effectiveness and robustness of the proposed method for visible-infrared fusion object detection under low illumination: compared with state-of-the-art fusion detection algorithms it improves mean average precision (mAP) by 2.3% and 2.2%, respectively, while keeping the parameter count low, and compared with the widely used single-modality YOLOv8 it improves mAP by up to 12.9%. Cross-scene experiments on the LIS real low-light dataset further confirm the good generalization of the proposed method.
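None of the modules described above come with code in this excerpt. As a rough illustration of the SDIM idea, the sketch below scores illumination per superpixel, so that a downstream fusion stage could lean on the visible modality in bright regions and on infrared in dark ones; the function name, the SLIC segmentation choice, and every parameter value are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np
from skimage.segmentation import slic  # SLIC superpixels; an assumed choice


def illumination_aware_mask(rgb: np.ndarray, n_segments: int = 200) -> np.ndarray:
    """Per-superpixel brightness map, a rough stand-in for an SDIM-style mask.

    rgb: H x W x 3 float image in [0, 1]. Returns an H x W map in [0, 1] where
    each superpixel carries its mean luminance; bright regions would favor the
    visible modality, dark regions the infrared one.
    """
    luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    labels = slic(rgb, n_segments=n_segments, compactness=10, start_label=0)
    mask = np.zeros_like(luma)
    for s in range(labels.max() + 1):
        region = labels == s
        mask[region] = luma[region].mean()  # one illumination score per superpixel
    return mask
```

Because such a mask follows superpixel boundaries instead of using one global illumination score, it can preserve object edges, which is the failure mode the abstract attributes to earlier illumination-aware gating.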
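The LIIE module is described only at the level of "end-to-end, detection-oriented enhancement". One plausible realization is a Zero-DCE-style learnable curve adjustment, sketched below in PyTorch; the class name, layer widths, and iteration count are assumptions, not the published design.

```python
import torch
import torch.nn as nn


class CurveEnhance(nn.Module):
    """Zero-DCE-style curve estimation, a plausible skeleton for an LIIE-like module.

    A small CNN predicts a per-pixel curve parameter map a, then applies the
    quadratic curve LE(x) = x + a * x * (1 - x) for several iterations. Placed in
    front of a detector, the detection loss alone can drive the enhancement end
    to end, which matches the behavior the abstract describes.
    """

    def __init__(self, channels: int = 32, iterations: int = 4):
        super().__init__()
        self.iterations = iterations
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1), nn.Tanh(),  # curve map in (-1, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: visible image in [0, 1], shape (B, 3, H, W)
        a = self.net(x)
        for _ in range(self.iterations):
            x = x + a * x * (1 - x)  # brightens dark pixels while saturating near 1
        return x
```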
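The MFCF description, hierarchical cross-attention plus dynamic weight allocation, maps naturally onto bidirectional cross-attention with a learned gate. The sketch below shows one scale of such a block; stacking one per feature-pyramid level would give a multi-scale variant. The class name, head count, and gating design are assumptions.

```python
import torch
import torch.nn as nn


class CrossModalFusion(nn.Module):
    """One scale of a cross-attention fusion block in the spirit of MFCF (illustrative)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.v2i = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.i2v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, f_vis: torch.Tensor, f_ir: torch.Tensor) -> torch.Tensor:
        # f_vis, f_ir: (B, N, C) token sequences from flattened feature maps
        v, _ = self.v2i(f_vis, f_ir, f_ir)   # visible queries attend to infrared
        i, _ = self.i2v(f_ir, f_vis, f_vis)  # infrared queries attend to visible
        w = self.gate(torch.cat([v, i], dim=-1))  # dynamic per-token fusion weights
        return w * v + (1 - w) * i
```

The sigmoid gate plays the role of the dynamic weight allocation: for well-lit tokens it can push the mix toward the visible stream, and for dark tokens toward the infrared one.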

Cite this article

LIU Kui, SUN Hao, WU Han, JI Ke-Feng, KUANG Gang-Yao. Dynamic brightness reconstruction for UAV visible-infrared fusion object detection: "UAV Multi-Source Perception in Interference Environments" Special Column[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2025.31968

