ACTA AERONAUTICA ET ASTRONAUTICA SINICA
Dynamic brightness reconstruction for UAV visible-infrared fusion object detection
Received date: 2025-03-12
Revised date: 2025-03-29
Accepted date: 2025-05-28
Online published: 2025-06-06
Supported by the National Natural Science Foundation of China (61971426)
Visible-infrared fusion object detection for Unmanned Aerial Vehicles (UAVs) has important application value in military and civilian fields such as disaster rescue, security monitoring, and battlefield reconnaissance. However, under low-illumination conditions, existing fusion strategies have several limitations: they ignore uneven lighting across different areas of the same scene and over-rely on the infrared modality, leaving the potentially rich semantic information of visible images underexploited in low illumination. In addition, low light further exacerbates the difficulty of cross-modal fusion. To address these problems, a dynamic brightness reconstruction method for UAV visible-infrared fusion object detection is proposed. First, a Superpixel Dynamic Illumination-aware Mask (SDIM) module is designed using prior local illumination information; by simulating the dependence of real scenes on different modalities and introducing superpixel information, it alleviates the loss of object edge features in existing methods. Second, to address feature degradation in low-light visible images, a Low-Illumination Image Enhancement (LIIE) module is designed to achieve end-to-end, detection-oriented adaptive enhancement of key semantics in visible images. Finally, a Multi-Scale Feature Cross-attention Fusion (MFCF) module is designed to resolve the fusion conflicts caused by cross-modal feature heterogeneity: it constructs a bimodal feature interaction space through a hierarchical cross-attention mechanism and adaptively fuses multi-scale features with a dynamic weight allocation strategy. Experiments on the typical visible-infrared fusion object detection datasets DroneVehicle and VEDAI verify the effectiveness and robustness of the proposed method for visible-infrared fusion object detection under low-illumination conditions.
Specifically, compared with existing advanced fusion detection algorithms, the proposed method improves mean average precision (mAP) by 2.3% and 2.2% on the two datasets, respectively, while keeping the parameter count low; compared with the widely used single-modality YOLOv8 algorithm, mAP increases by up to 12.9%. In addition, cross-scene experiments on the LIS real low-illumination dataset further validate the strong generalization capability of the proposed method.
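The superpixel-level illumination gating described in the abstract can be illustrated with a minimal sketch: each superpixel's mean brightness is mapped to a pair of modality weights, so dark regions lean on infrared features and well-lit regions lean on visible features. The sigmoid gate and its parameters below are illustrative assumptions for exposition, not the paper's actual SDIM formulation.

```python
import math

def illumination_weights(brightness_by_superpixel, k=10.0, tau=0.5):
    """Map each superpixel's mean brightness (in [0, 1]) to a
    (visible, infrared) weight pair that sums to 1.

    k (gate steepness) and tau (brightness threshold) are illustrative
    values, not taken from the paper.
    """
    weights = {}
    for sp_id, brightness in brightness_by_superpixel.items():
        # Sigmoid gate: bright regions -> visible weight near 1,
        # dark regions -> visible weight near 0 (infrared dominates).
        w_vis = 1.0 / (1.0 + math.exp(-k * (brightness - tau)))
        weights[sp_id] = (w_vis, 1.0 - w_vis)
    return weights

# Example: a dark road region leans on infrared, a lit sign on visible.
w = illumination_weights({"dark_road": 0.1, "lit_sign": 0.9})
```

A per-superpixel gate of this kind, as opposed to one global illumination score per image, is what lets a method handle uneven lighting across different areas of the same scene.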
Key words: low illumination; UAV; visible-infrared; deep learning; object detection
Kui LIU, Hao SUN, Han WU, Kefeng JI, Gangyao KUANG. Dynamic brightness reconstruction for UAV visible-infrared fusion object detection[J]. Acta Aeronautica et Astronautica Sinica, 2025, 46(23): 631968. DOI: 10.7527/S1000-6893.2025.31968