Acta Aeronautica et Astronautica Sinica (航空学报) > 2024, Vol. 45 Issue (14): 628959-628959

Infrared small object detection with a fused attention mechanism

李峻宇 (Junyu LI)1, 刘乾坤 (Qiankun LIU)1, 付莹 (Ying FU)1,2

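  1. MIIT Key Laboratory of Complex-field Intelligent Sensing, Beijing Institute of Technology, Beijing 100081, China
  2. Yangtze Delta Region Academy of Beijing Institute of Technology, Jiaxing 314019, China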
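  • Received: 2023-05-03 Revised: 2023-05-30 Accepted: 2023-07-03 Online: 2024-07-25 Published: 2024-06-17
  • Corresponding author: 付莹 (Ying FU) E-mail: fuying@bit.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62171038); Joint Project of the Beijing Municipal Education Commission and the Beijing Natural Science Foundation (KZ202211417048)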

Infrared small object detection based on attention mechanism

Junyu LI1, Qiankun LIU1, Ying FU1,2

  1. MIIT Key Laboratory of Complex-field Intelligent Sensing, Beijing Institute of Technology, Beijing 100081, China
  2. Yangtze Delta Region Academy of Beijing Institute of Technology, Jiaxing 314019, China
  • Received: 2023-05-03 Revised: 2023-05-30 Accepted: 2023-07-03 Online: 2024-07-25 Published: 2024-06-17
  • Contact: Ying FU E-mail: fuying@bit.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62171038); The R&D Program of Beijing Municipal Education Commission (KZ202211417048)

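Abstract:

As detection accuracy for salient objects in images has improved, improving the detection accuracy of small objects has gradually become a focus of attention. Existing object detection methods mainly study general object detection with visible-light images as input, and most methods in the small object detection field also target visible-light images, with few addressing infrared images. Infrared small objects contain no color information, differ greatly in scale from conventional objects, and rely more heavily on contextual information. To address these problems, an infrared small object detection model based on YOLOv5 is proposed. Building on the standard YOLOv5 model, a dynamic context information extraction module is proposed to effectively combine the local information around an object with the global information in the overall features while adapting to the subtle morphological changes of infrared small objects, and a channel-detail attention module is introduced to aggregate the channel information and detail information of infrared small objects and improve regression accuracy. Considering the loss of detail features during network convolution, features at a new scale are upsampled and fused with shallow features, while keeping the model's feature scales in correspondence, to capture more detail of infrared small objects and avoid feature aliasing. To demonstrate the effectiveness of the method, it is validated on the public infrared datasets ITTD, IRSTD-1k, and NUAA-SIRST. The experimental results show that on the ITTD dataset the mAP of the proposed method exceeds that of the compared methods by 5.1%. Compared with the YOLOv5s baseline model, mAP is improved by 3.7%; good detection performance is also shown on the IRSTD-1k and NUAA-SIRST datasets, and ablation experiments are conducted on the model itself. The proposed infrared small object detection model is highly robust to infrared small objects in complex scenes, effectively improving small object detection accuracy and reducing the missed detection rate for small objects.

Key words: deep learning, object detection, infrared small object detection, attention mechanism, YOLOv5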

Abstract:

As the detection accuracy for salient objects in images has improved, research attention has gradually shifted to improving the accuracy of small object detection. However, existing object detection methods mainly address general object detection with visible images as input, and most small object detection methods are likewise designed for visible images, leaving small object detection in infrared images underexplored. Infrared small objects contain no color information and differ greatly in scale from conventional objects, which makes them more dependent on contextual information. To address these problems, an infrared small object detection model based on the standard YOLOv5 model is proposed. A Dynamic Contextual Information Extraction Module is proposed to effectively combine the local information around an object with the global information in the overall features while adapting to the subtle morphological changes of infrared small objects. A Channel-Detail Attention Module is introduced to aggregate the channel and detail information of infrared small objects and improve regression accuracy. To counter the loss of detailed features during network convolution, features at a new scale are upsampled and fused with shallow features while keeping the model's feature scales matched, which captures more detail of infrared small objects and avoids feature aliasing. To demonstrate the effectiveness of the proposed method, experiments are conducted on the public infrared datasets ITTD, IRSTD-1k, and NUAA-SIRST. The experimental results show that on the ITTD dataset the proposed method outperforms the compared methods by 5.1% in mAP and improves on the YOLOv5s baseline by 3.7% in mAP. Results on the IRSTD-1k and NUAA-SIRST datasets also show good detection performance, and an ablation study verifies the effectiveness of the individual modules. The proposed model is robust to infrared small objects in complex scenes, effectively improving the accuracy and reducing the missed detection rate of small object detection.
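
The abstract does not specify how the upsampled features are fused with the shallow features. As a rough illustration only, the sketch below assumes nearest-neighbour 2x upsampling followed by channel concatenation, a pattern common in YOLOv5-style necks; the function names `upsample_nearest_2x` and `fuse_with_shallow` are hypothetical and are not taken from the paper. Feature maps are plain nested lists shaped [channels][rows][cols] so that the example runs without any third-party library.

```python
# Hypothetical sketch, not the authors' implementation: nearest-neighbour
# 2x upsampling of a deep feature map, then channel concatenation with a
# shallow feature map. Maps are nested lists: [channels][rows][cols].

def upsample_nearest_2x(feature):
    """Double the height and width of every channel by repeating each value."""
    upsampled = []
    for channel in feature:
        rows = []
        for row in channel:
            wide_row = [value for value in row for _ in range(2)]
            rows.append(wide_row)
            rows.append(list(wide_row))  # repeat the widened row once more
        upsampled.append(rows)
    return upsampled


def fuse_with_shallow(deep, shallow):
    """Upsample `deep` and append its channels after those of `shallow`."""
    up = upsample_nearest_2x(deep)
    same_height = len(up[0]) == len(shallow[0])
    same_width = len(up[0][0]) == len(shallow[0][0])
    if not (same_height and same_width):
        raise ValueError("shallow map must be exactly twice the size of the deep map")
    return shallow + up  # concatenation along the channel axis


deep = [[[1, 2], [3, 4]]]                  # 1 channel, 2 x 2 (coarse)
shallow = [[[0] * 4 for _ in range(4)]]    # 1 channel, 4 x 4 (fine)
fused = fuse_with_shallow(deep, shallow)   # 2 channels, 4 x 4
```

The fused map keeps the fine spatial resolution of the shallow features while carrying the deeper channels alongside them, which is the general motivation the abstract gives for the extra scale. In a real detector the concatenation would normally be followed by learned convolutions, which this sketch omits.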

Key words: deep learning, object detection, infrared small object detection, attention mechanism, YOLOv5

CLC Number: