A UAV target tracking algorithm based on frequency-domain feature and transformer

doi:10.7527/S1000-6893.2025.32791

Electronics, Electrical Engineering and Control

Fang LIU¹, Jinghu CUI¹ (corresponding author), Chenyang LU¹, Xin WANG², Zhaohui PU³

¹ School of Information Science and Technology, Beijing University of Technology, Beijing 100124, China
² Fengtai Power Supply Bureau of Beijing Power Supply Bureau, Beijing 100161, China
³ Information and Communication Branch of State Grid Beijing Electric Power Company, Beijing 100761, China

Received: 2025-09-17; Revised: 2025-10-23; Accepted: 2025-11-20; Online: 2025-12-01; Published: 2025-11-28
Contact: Jinghu CUI, E-mail: S202487019@emails.bjut.edu.cn

Abstract:

With the rapid development of unmanned aerial vehicle (UAV) technology, target tracking has become one of the key techniques in UAV applications. To address occlusion, deformation, scale variation, and multi-view changes of the target in UAV tracking, this paper proposes a UAV target tracking algorithm based on frequency-domain features and a Transformer architecture. First, a distilled Transformer network extracts global spatial features from the images, and an adaptive frequency-domain perception network then captures detailed frequency-domain features. Meanwhile, a learning image is added at the input as a supplement to capture the correlation between the target template and the search region; this correlation is used to update the initial target template and strengthen the target representation. Second, a multi-view invariant feature learning strategy based on mutual information maximization is proposed: a new loss function is designed by maximizing the mutual information between the target template and the search template, which improves the network's ability to handle changes in target appearance. Finally, the target position is determined from the feature response of the learning image. Simulation results show that the proposed algorithm effectively improves UAV target tracking accuracy and exhibits good robustness.
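
To make the notion of frequency-domain detail features concrete, the sketch below computes a naive 2-D discrete Fourier transform of a small grayscale patch and measures how much of its non-DC spectral energy lies above a cutoff frequency. It is only an illustration under stated assumptions: the fixed radial cutoff, the function names `dft2` and `high_frequency_ratio`, and the use of a hand-written DFT are hypothetical choices made here, and they do not reproduce the learned adaptive frequency-domain network described in the abstract.

```python
import cmath
import math


def dft2(patch):
    """Naive 2-D discrete Fourier transform of a small real-valued patch."""
    h, w = len(patch), len(patch[0])
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * math.pi * (u * y / h + v * x / w)
                    acc += patch[y][x] * cmath.exp(1j * angle)
            out[u][v] = acc
    return out


def high_frequency_ratio(patch, cutoff=0.25):
    """Share of non-DC spectral energy above `cutoff` (cycles per pixel).

    The cutoff is a hypothetical fixed threshold, not a learned quantity.
    """
    h, w = len(patch), len(patch[0])
    spec = dft2(patch)
    low = high = 0.0
    for u in range(h):
        for v in range(w):
            if u == 0 and v == 0:
                continue  # skip the DC term (mean brightness)
            # Fold indices so that k and N - k map to the same frequency.
            fu = min(u, h - u) / h
            fv = min(v, w - v) / w
            energy = abs(spec[u][v]) ** 2
            if math.hypot(fu, fv) > cutoff:
                high += energy
            else:
                low += energy
    total = low + high
    # Treat numerically empty spectra (e.g. constant patches) as ratio 0.
    return high / total if total > 1e-12 else 0.0


# A checkerboard is dominated by fine detail; a slow cosine is not.
checker = [[(x + y) % 2 for x in range(8)] for y in range(8)]
smooth = [[math.cos(2 * math.pi * x / 8) for x in range(8)] for _ in range(8)]
print(high_frequency_ratio(checker), high_frequency_ratio(smooth))
```

In this toy setting a textured patch yields a ratio near 1 and a smooth patch a ratio near 0, which is the kind of detail cue that spatial global features alone tend to under-represent.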

Key words: machine vision, unmanned aerial vehicle, target tracking, frequency-domain feature, deep network
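
The abstract does not state which mutual information estimator underlies the proposed loss. As a hedged illustration of the general idea, the sketch below uses the InfoNCE bound, a commonly used contrastive lower bound on mutual information, between template features and search features. The choice of InfoNCE, the cosine similarity, the temperature value, and the names `info_nce_loss` and `mi_lower_bound` are assumptions made here for illustration and should not be read as the authors' loss function.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)


def info_nce_loss(templates, searches, temperature=0.1):
    """InfoNCE loss; templates[i] and searches[i] form the positive pair."""
    n = len(templates)
    loss = 0.0
    for i in range(n):
        logits = [cosine(templates[i], searches[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_sum - logits[i]
    return loss / n


def mi_lower_bound(templates, searches, temperature=0.1):
    """InfoNCE bound: I(template; search) >= log(N) - loss."""
    return math.log(len(templates)) - info_nce_loss(
        templates, searches, temperature)


# Toy features: each search vector is a slightly perturbed view of its template.
templates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
shuffled = [aligned[1], aligned[2], aligned[0]]

print(info_nce_loss(templates, aligned))   # small: pairs agree
print(info_nce_loss(templates, shuffled))  # large: pairs mismatched
```

Minimizing this loss raises the lower bound, so features of the same target seen in the template and in the search region are pulled together while features of other targets in the batch are pushed apart, which is one way to encourage view-invariant representations.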

CLC number: V279

Fang LIU, Jinghu CUI, Chenyang LU, Xin WANG, Zhaohui PU. A UAV target tracking algorithm based on frequency-domain feature and transformer[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(8): 332791.

Figures/Tables: 7 (Fig. 1, Fig. 2, Fig. 3, Fig. 4, Table 1, Table 2, Fig. 5)



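| Experiment No. | Plain template update | Adaptive frequency-domain perception network | Multi-view invariant feature learning strategy | Learning-image template update | AO (%) | SR_0.5 (%) | GPU FPS | Parameters (M) |
|---|---|---|---|---|---|---|---|---|
| 1 | √ | × | × | × | 71.7 | 80.2 | 165 | 20 |
| 2 | √ | √ | × | × | 73.3 | 83.4 | 152 | 26 |
| 3 | √ | √ | √ | × | 74.5 | 84.8 | 151 | 27 |
| 4 | × | √ | √ | √ | 74.6 | 84.2 | 156 | 24 |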
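
The AO and SR_0.5 columns above are reported without a definition on this page. The sketch below assumes the usual reading of these names: AO as the mean intersection-over-union (IoU) between predicted and ground-truth boxes over all frames, and SR_0.5 as the fraction of frames whose IoU exceeds 0.5. The box layout `(x, y, w, h)` and the function names are hypothetical choices for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_overlap(preds, gts):
    """AO: mean IoU over all frames."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    return sum(overlaps) / len(overlaps)


def success_rate(preds, gts, threshold=0.5):
    """SR_t: fraction of frames whose IoU exceeds the threshold t."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    return sum(1 for o in overlaps if o > threshold) / len(overlaps)


# Three frames: a perfect hit, a half-shifted box, and a complete miss.
gts = [(0, 0, 10, 10)] * 3
preds = [(0, 0, 10, 10), (5, 0, 10, 10), (20, 20, 10, 10)]
print(average_overlap(preds, gts), success_rate(preds, gts))
```

With these toy boxes the per-frame IoUs are 1, 1/3 and 0, so AO rewards partial overlap on the second frame while SR_0.5 counts only the first frame as a success.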

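| Tracker | OTB70 precision (%) | OTB70 success rate (%) | LaSOT precision (%) | LaSOT success rate (%) | UAV123 precision (%) | UAV123 success rate (%) | GPU FPS (frames/s) |
|---|---|---|---|---|---|---|---|
| Ours | 82.1 | 63.9 | 66.5 | 64.0 | 90.2 | 68.8 | 156 |
| F-Resnet | - | - | 62.0 | 59.9 | 77.2 | - | - |
| SpectralTracker | - | - | 69.1 | 65.7 | 78.2 | 67.4 | 52 |
| IRE-UFT | - | - | - | - | 69.1 | 65.3 | - |
| SMAT | 81.9 | 63.8 | 64.6 | 61.7 | 81.8 | 64.6 | 124 |
| DiMP | - | - | 56.7 | 56.9 | - | 65.4 | 77 |
| TCTrack | 81.2 | 62.2 | - | - | 80.8 | 60.6 | 139 |
| ABDNet | 76.8 | 59.6 | - | - | 79.3 | 60.7 | 130 |
| FEAT-L | - | - | 60.9 | 57.9 | 86.6 | 65.8 | 80 |
| HCAT | 60.6 | 58.3 | 60.5 | 59.0 | 63.6 | - | 195 |
| FDSST | 53.4 | 35.7 | - | - | 58.3 | 40.5 | - |
| AutoTrack | 71.6 | 47.8 | - | - | 68.9 | 47.2 | 58 |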
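
The precision and success-rate columns above are not defined on this page. The sketch below assumes the conventional one-pass evaluation used with OTB-style benchmarks: precision as the fraction of frames whose predicted box center lies within 20 pixels of the ground-truth center, and success as the area under the curve of the success plot, i.e. the fraction of frames with IoU above a threshold, averaged over thresholds from 0 to 1. The 20-pixel radius, the 21 thresholds, and the function names are assumptions for illustration.

```python
import math


def center_error(pred, gt):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2.0, pred[1] + pred[3] / 2.0
    gx, gy = gt[0] + gt[2] / 2.0, gt[1] + gt[3] / 2.0
    return math.hypot(px - gx, py - gy)


def precision_at(preds, gts, pixels=20.0):
    """Fraction of frames with center error within `pixels`."""
    hits = sum(1 for p, g in zip(preds, gts) if center_error(p, g) <= pixels)
    return hits / len(preds)


def success_auc(overlaps, steps=21):
    """Mean success rate over `steps` evenly spaced IoU thresholds in [0, 1]."""
    thresholds = [i / (steps - 1) for i in range(steps)]
    rates = [sum(1 for o in overlaps if o > t) / len(overlaps)
             for t in thresholds]
    return sum(rates) / len(rates)


# Three frames: exact hit, 5-pixel offset, and a far miss.
gts = [(0, 0, 10, 10)] * 3
preds = [(0, 0, 10, 10), (5, 0, 10, 10), (20, 20, 10, 10)]
overlaps = [1.0, 1.0 / 3.0, 0.0]  # per-frame IoUs for the boxes above

print(precision_at(preds, gts), success_auc(overlaps))
```

Precision depends only on center distance, so it stays high when a box is well centered but badly scaled, whereas the success measure penalizes scale errors through the IoU; this is why the two columns can rank trackers differently.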
