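Multi-Stage Distillation-Based Incremental Detection Algorithm for Time-Sensitive Targets in UAV Images (Note: special column "UAV Multi-Source Perception in Interference Environments", (1) UAV image target recognition and tracking)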

  • 成桢灏 ,
  • 杨小冈 ,
  • 卢瑞涛 ,
  • 张涛 ,
  • 王思宇
  • 1. Rocket Force University of Engineering
    2. College of Missile Engineering, Rocket Force University of Engineering

Received date: 2025-03-11

  Revised date: 2025-06-29

  Online published: 2025-07-03

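Funding

Intelligent Recognition and Tracking Methods for Long-Range Infrared Dim and Small Ship-Group Targets; Development of a Visual Cognitive Navigation UAV System under Complex Interference Conditions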

Multi-Stage Distillation for Incremental Detection of Time-Sensitive Targets in UAV Images

  • CHENG Zhen-Hao ,
  • YANG Xiao-Gang ,
  • LU Rui-Tao ,
  • ZHANG Tao ,
  • WANG Si-Yu

Received date: 2025-03-11

  Revised date: 2025-06-29

  Online published: 2025-07-03

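Abstract

Class-incremental detection of time-sensitive targets in UAV images currently suffers from catastrophic forgetting, overfitting, and limited detection accuracy caused by poor adaptation to the characteristics of dense detectors. To address these problems, this paper proposes an incremental detection algorithm for time-sensitive targets based on multi-stage distillation. The algorithm consists mainly of an inter-class distillation module based on the continuous Wasserstein distance (WICD), a prototype-guided intra-class consistency distillation (PGICD) module, and a cross-prediction adaptive distillation (CAD) module, and it is validated experimentally on the SIMD and MAR20 datasets. The WICD module captures inter-class feature differences from feature maps and semantic query vectors, and uses Gaussian distributions together with the continuous Wasserstein distance to enhance inter-class discriminability. The PGICD module minimizes the prototype differences between the teacher and student networks for both the high-level semantic queries and the low-level feature maps of instances, which transfers intra-class features effectively and strengthens intra-class consistency. The CAD module dynamically adjusts the distillation weights of the classification and regression branches to optimize cross-prediction distillation, which alleviates catastrophic forgetting in incremental learning and improves detection accuracy in complex scenes. Experimental results on the SIMD and MAR20 datasets show that the proposed method performs well in all types of one-step and multi-step incremental scenarios, with a significant improvement in average precision (AP) over traditional methods. In the 8-class + 7-class incremental scenario on SIMD, AP reaches 70.8%, an absolute gap of 1.7 percentage points and a relative gap of 2.3% from the upper bound; in the 10-class + 10-class incremental scenario on MAR20, AP reaches 60.2%, an absolute gap of 2.3 percentage points and a relative gap of 3.6% from the upper bound. Ablation experiments further verify the effectiveness of each module, and the method effectively improves incremental detection of time-sensitive targets in UAV images.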
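The abstract describes PGICD only at a high level, so the following is a minimal sketch of the general idea of prototype matching, not the paper's formulation. It assumes a prototype is the mean feature vector of the instances of one class and that the discrepancy is a mean squared Euclidean distance between teacher and student prototypes. The function names `class_prototypes` and `prototype_discrepancy`, the class labels, and the toy feature values are all hypothetical.

```python
from collections import defaultdict


def class_prototypes(features, labels):
    """Average the feature vectors of each class into one prototype per class."""
    grouped = defaultdict(list)
    for vec, label in zip(features, labels):
        grouped[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def prototype_discrepancy(teacher_protos, student_protos):
    """Mean squared Euclidean distance over classes present in both networks."""
    shared = teacher_protos.keys() & student_protos.keys()
    if not shared:
        return 0.0
    total = 0.0
    for label in shared:
        pairs = zip(teacher_protos[label], student_protos[label])
        total += sum((t - s) ** 2 for t, s in pairs)
    return total / len(shared)


# Toy instance features (assumed values) for two old classes.
labels = ["ship", "ship", "plane"]
teacher = class_prototypes([[1.0, 1.0], [3.0, 3.0], [0.0, 4.0]], labels)
student = class_prototypes([[1.0, 2.0], [3.0, 4.0], [0.0, 4.0]], labels)
loss = prototype_discrepancy(teacher, student)
```

In a real detector the same computation would presumably be applied once to query embeddings and once to pooled feature-map regions, with the result added to the training loss; those details are not given in the abstract.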

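Cite this article

CHENG Zhen-Hao, YANG Xiao-Gang, LU Rui-Tao, ZHANG Tao, WANG Si-Yu. Multi-Stage Distillation-Based Incremental Detection Algorithm for Time-Sensitive Targets in UAV Images (Note: special column "UAV Multi-Source Perception in Interference Environments", (1) UAV image target recognition and tracking) [J]. Acta Aeronautica et Astronautica Sinica, 0 : 1 -0 . DOI: 10.7527/S1000-6893.2025.31959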

Abstract

To address the challenges of catastrophic forgetting, overfitting, and limited detection accuracy due to the difficulty in adapting to dense detector characteristics in class-incremental detection of time-sensitive targets in UAV images, this paper proposes a time-sensitive target incremental detection algorithm based on multi-stage distillation. The algorithm primarily comprises a Wasserstein-based Inter-Class Distillation (WICD) module, a Prototype Guided Intra-Class Consistency Distillation (PGICD) module, and a Cross Prediction Adaptive Distillation (CAD) module, with experimental validation conducted on the SIMD and MAR20 datasets. The WICD module captures inter-class feature differences from feature maps and semantic query vectors, leveraging Gaussian distributions and the continuous Wasserstein distance to enhance inter-class discriminability. The PGICD module achieves effective intra-class feature transfer and strengthens intra-class consistency by minimizing prototype discrepancies between high-level semantic queries and low-level feature maps of instances in the teacher and student networks. The CAD module optimizes the cross-prediction distillation process by dynamically adjusting the distillation weights of the classification and regression branches, alleviating catastrophic forgetting in incremental learning and improving the model's detection accuracy in complex scenarios. Experimental results on the SIMD and MAR20 datasets demonstrate that the proposed method performs well in various one-step and multi-step incremental scenarios, with significantly improved average precision (AP) compared to traditional methods. For example, in the 8-class + 7-class incremental scenario on the SIMD dataset, the AP reaches 70.8%, with an absolute gap of 1.7 percentage points and a relative gap of 2.3% compared to the upper bound. In the 10-class + 10-class incremental scenario on the MAR20 dataset, the AP reaches 60.2%, with an absolute gap of 2.3 percentage points and a relative gap of 3.6% compared to the upper bound. Additionally, ablation experiments verify the effectiveness of each module in enhancing the incremental detection performance for time-sensitive targets in UAV images.
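The abstract does not state how WICD combines Gaussian distributions with the Wasserstein distance. As an illustration of the underlying quantity only, the sketch below assumes each class's features are summarized by a Gaussian with diagonal covariance, for which the 2-Wasserstein distance has a well-known closed form: the squared distance is the squared difference of the means plus the squared difference of the per-dimension standard deviations. The helper names `fit_diag_gaussian` and `w2_diag_gaussians` and the toy feature values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def fit_diag_gaussian(features):
    """Fit a per-dimension mean and standard deviation to a list of feature vectors."""
    dims = list(zip(*features))  # transpose: one tuple per feature dimension
    mu = [fmean(d) for d in dims]
    sigma = [pstdev(d) for d in dims]
    return mu, sigma


def w2_diag_gaussians(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2))."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma1, sigma2))
    return math.sqrt(mean_term + std_term)


# Toy features (assumed values) for two classes in a 2-D feature space.
class_a = [[0.0, 0.0], [2.0, 2.0]]
class_b = [[3.0, 4.0], [5.0, 6.0]]
mu_a, sigma_a = fit_diag_gaussian(class_a)
mu_b, sigma_b = fit_diag_gaussian(class_b)
distance = w2_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b)
```

One plausible use, again an assumption rather than the paper's method, is to compute such distances between every pair of classes in the teacher and in the student and penalize the student when its inter-class distances drift from the teacher's. Unlike a divergence that requires overlapping support, this distance stays finite and informative when the two class distributions are far apart.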