Acta Aeronautica et Astronautica Sinica, 2024, Vol. 45, Issue 14: 629490. doi: 10.7527/S1000-6893.2024.29490

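Infrared dim and small target detection based on a multi-layer multi-direction Transformer

Xiao WANG, Zhenbao LIU

  1. School of Civil Aviation, Northwestern Polytechnical University, Xi'an 710072
  • Received: 2023-08-30 Revised: 2023-10-30 Accepted: 2024-01-08 Publication date: 2024-07-25 Release date: 2024-01-24
  • Corresponding author: Zhenbao LIU, E-mail: liuzhenbao@nwpu.edu.cn
  • Funding:
    National Natural Science Foundation of China (52072309); Key Research and Development Program of Shaanxi Province (2019ZDLGY14-02-01); Shenzhen Fundamental Research Program (JCYJ20190806152203506); Aeronautical Science Foundation of China (ASFC-2018ZC53026)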

Infrared small target detection based on multi-layer multi-direction Transformer

Xiao WANG, Zhenbao LIU

  1. School of Civil Aviation, Northwestern Polytechnical University, Xi'an 710072, China
  • Received: 2023-08-30 Revised: 2023-10-30 Accepted: 2024-01-08 Online: 2024-01-24 Published: 2024-07-25
  • Contact: Zhenbao LIU, E-mail: liuzhenbao@nwpu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (52072309); Key Research and Development Program of Shaanxi (2019ZDLGY14-02-01); Shenzhen Fundamental Research Program (JCYJ20190806152203506); Aeronautical Science Foundation of China (ASFC-2018ZC53026)

Abstract:

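Infrared small target detection methods based on convolutional neural networks face three problems: the receptive field of a convolutional network is limited, the downsampling operations used to enlarge it tend to lose features, and convolutional networks have limited ability to extract local contrast features. To address these problems, an infrared small target detection algorithm based on a multi-layer multi-direction Transformer is proposed. First, the Transformer, which has a larger receptive field and a stronger ability to extract contrast features, is used as the basic unit, and a U-shaped deep neural network with a multi-feature-layer fusion network is designed to fuse local and global features. In addition, a dual-direction attention operator is designed for the decoder network; it uses the self-attention mechanism to compute attention features along the spatial direction and the feature direction separately, further improving the network's ability to extract contrast features between a small target and its surrounding region. Furthermore, a number constraint network is appended to the end of the backbone network; by counting and comparing the number of targets in the detection results, it reduces the number of falsely detected targets and improves detection accuracy. Finally, comparative experiments against existing methods on several datasets show improvements of up to 35% on the evaluation metrics, demonstrating that the proposed infrared small target detection algorithm achieves high accuracy.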

Keywords: infrared small target detection, Transformer, multi-feature-layer fusion, dual-direction attention operator, number constraint
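
The number constraint described in the abstract compares how many targets appear in a detection result with how many are expected. The abstract does not give the network's details, so the following is only a rough illustration under stated assumptions: it replaces the learned counting network with a plain 8-connected component count on binary masks, and the names count_targets and count_penalty are hypothetical.

```python
from collections import deque


def count_targets(mask):
    """Count 8-connected groups of nonzero pixels in a 2D binary mask.

    Assumption: a non-learned stand-in for the counting network;
    `mask` is a list of equal-length rows of 0/1 values.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # A new, unvisited target pixel starts a new component.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def count_penalty(pred_mask, gt_mask):
    """Absolute difference between predicted and ground-truth target counts."""
    return abs(count_targets(pred_mask) - count_targets(gt_mask))


# Hypothetical example: the prediction contains one spurious target.
pred = [[1, 0, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
gt = [[1, 0, 0, 0],
      [0, 0, 0, 0],
      [0, 1, 1, 0]]
penalty = count_penalty(pred, gt)
```

A nonzero penalty signals that the prediction has too many or too few targets. In a trainable setting the count would have to come from a differentiable branch rather than from this discrete procedure.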

Abstract:

Infrared small target detection based on convolutional neural networks suffers from the limited receptive field of the convolution kernel, the information loss caused by downsampling, and the limited ability of convolutional networks to extract contrast information. To solve these problems, a neural network based on a multi-layer multi-direction Transformer is proposed. First, the Transformer block is adopted as the basic operator because it has a larger receptive field and is more powerful in extracting contrast information. The proposed network is U-shaped and fuses local and global information through a multi-layer structure. Meanwhile, to enhance the network's ability to detect infrared small targets, a dual-direction attention operator, which computes attention along the spatial and channel directions, is designed for the decoder network. In addition, an auxiliary network is appended to the backbone to compute the number of detected infrared small targets; it reduces the number of falsely detected targets by comparing the computed number with the ground truth. Finally, the proposed method is compared with state-of-the-art methods on several datasets and improves the evaluation metrics by up to 35%, which demonstrates its effectiveness.

Keywords: infrared small target detection, Transformer, multi-layer fusion, dual-direction attention operator, number supervision
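
The dual-direction attention operator in the abstract computes self-attention along the spatial direction and along the channel direction. A minimal NumPy sketch of that idea follows. It is not the authors' implementation: the identity query/key/value projections, the scaling factors, and the residual sum used to combine the two branches are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_attention(feat):
    """Self-attention across the N spatial positions of an (N, C) feature map.

    Assumption: queries, keys and values are the features themselves
    (no learned projections).
    """
    n, c = feat.shape
    scores = feat @ feat.T / np.sqrt(c)       # (N, N) position-to-position
    return softmax(scores, axis=-1) @ feat    # (N, C)


def channel_attention(feat):
    """Self-attention across the C channels of an (N, C) feature map.

    Each channel is treated as a token of length N.
    """
    n, c = feat.shape
    scores = feat.T @ feat / np.sqrt(n)       # (C, C) channel-to-channel
    return feat @ softmax(scores, axis=-1).T  # (N, C)


def dual_direction_attention(feat):
    """Combine both branches with a residual sum (an assumed fusion rule)."""
    return feat + spatial_attention(feat) + channel_attention(feat)


# Hypothetical input: a 4x4 feature map flattened to 16 positions, 8 channels.
rng = np.random.default_rng(0)
features = rng.standard_normal((16, 8))
fused = dual_direction_attention(features)
```

The spatial branch lets every position weigh every other position, which matches the abstract's motivation of contrasting a small target with its surroundings. The channel branch reweights feature channels against each other at a cost that depends on C rather than on the image size.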

CLC number: