Sparse point matching-based collaborative category-agnostic object tracking method

doi:10.7527/S1000-6893.2025.32425

Abstract

Real-time perception and continuous tracking of unknown objects are critical for autonomous intelligent systems. However, the absence of prior category knowledge and the scarcity of training samples make unknown targets difficult to perceive and track. To address this issue, we propose a category-agnostic object tracking method based on the Segment Anything Model (SAM) and sparse feature point matching. The method first uses prompt points to guide SAM in segmenting the unknown object, then extracts sparse keypoints with a network-based feature extraction model, and matches them across frames with an attention-based network to propagate object information. An Iterative SAM with Point Consensus (ISPC) procedure is introduced to maintain the segmentation and keep tracking stable over time. Because the target descriptors built from sparse feature points are lightweight, they can be shared efficiently among multiple agents, enabling the construction of a collaborative target tracking system. Experiments on the DAVIS 2017 dataset and a self-constructed near-infrared video dataset demonstrate strong robustness and accuracy in collaborative perception and tracking of unknown-category objects.
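
The propagation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: mutual nearest-neighbour matching on cosine similarity stands in for the attention-based matching network, the descriptors are toy vectors rather than outputs of a feature extraction network, and the names `mutual_nn_matches`, `propagate_prompts`, `min_sim` and `on_target` are hypothetical. The sketch only shows how matched keypoints that lay on the target in one frame could become prompt points for the next frame.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mutual_nn_matches(desc_a, desc_b, min_sim=0.8):
    """Return (i, j) pairs where i and j are each other's best match.

    Assumption: a simple stand-in for a learned matcher, not the
    attention-based network mentioned in the abstract.
    """
    if not desc_a or not desc_b:
        return []
    sim = [[cosine(a, b) for b in desc_b] for a in desc_a]
    best_for_a = [max(range(len(desc_b)), key=lambda j: row[j]) for row in sim]
    best_for_b = [max(range(len(desc_a)), key=lambda i: sim[i][j])
                  for j in range(len(desc_b))]
    return [(i, j) for i, j in enumerate(best_for_a)
            if best_for_b[j] == i and sim[i][j] >= min_sim]


def propagate_prompts(kpts_next, matches, on_target):
    """Next-frame keypoints matched to on-target keypoints become prompts."""
    return [kpts_next[j] for i, j in matches if on_target[i]]


# Toy data: three keypoints in the previous frame, two of them on the target.
desc_prev = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
on_target = [True, True, False]

# Four keypoints detected in the next frame (pixel coordinates are made up).
desc_next = [[0.0, 0.9, 0.1], [0.95, 0.05, 0.0], [0.1, 0.0, 0.9], [0.5, 0.5, 0.5]]
kpts_next = [(12, 40), (30, 8), (55, 60), (70, 70)]

matches = mutual_nn_matches(desc_prev, desc_next)
prompts = propagate_prompts(kpts_next, matches, on_target)
print(matches)
print(prompts)
```

In a full pipeline the resulting points would be passed to the segmentation model as positive prompts. The consensus and iteration logic of ISPC is not described in the abstract and is therefore not modelled here.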

Key words: object tracking, object segmentation, feature extraction, feature matching, collaborative perception

CLC Number: V243

Rongling LANG, Cailun WEI, Ya FAN, Fei GAO. Sparse point matching-based collaborative category-agnostic object tracking method[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(3): 632425.

Table 1 Quantitative comparison of object tracking and segmentation methods on the DAVIS 2017 validation set

Method	$J\&F$	$J$	$F$
Painter	34.6	28.5	40.8
DINO	71.4	67.9	76.9
SegGPT	75.6	72.5	78.6
SAM-PT	76.6	74.4	78.9
Ours	75.8	73.2	78.4
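
For readers unfamiliar with the DAVIS metrics in Table 1: $J$ is the region similarity (the Jaccard index, or intersection over union, between predicted and ground-truth masks), $F$ is the contour accuracy (a boundary F-measure), and $J\&F$ is their mean. The sketch below computes $J$ for toy binary masks and checks the $J\&F$ value reported for the proposed method. The function names are illustrative, the handling of two empty masks is an assumption, and $F$ is omitted because it requires boundary extraction.

```python
def region_similarity(pred, gt):
    """Jaccard index (IoU) between two binary masks given as nested lists."""
    inter = 0
    union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def j_and_f(j_mean, f_mean):
    """Overall score: arithmetic mean of region similarity and contour accuracy."""
    return (j_mean + f_mean) / 2.0


# Toy 3x3 masks: 3 pixels overlap, 5 pixels are covered by either mask.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
gt = [[0, 1, 1],
      [0, 1, 0],
      [0, 1, 0]]

j_toy = region_similarity(pred, gt)
# J and F of the proposed method as listed in Table 1.
jf_ours = round(j_and_f(73.2, 78.4), 1)
print(j_toy, jf_ours)
```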

Table 2 Quantitative comparison on surveillance or low-dynamic platform sequences in the DAVIS 2017 validation set

Sequence	Ours $J$	Ours $F$	SAM-PT $J$	SAM-PT $F$	SegGPT $J$	SegGPT $F$
blackswan	95.4	97.3	95.2	97.1	93.4	98.2
camel	96.8	99.0	97.7	99.1	80.1	86.9
cows	94.8	96.0	95.3	95.0	90.4	93.0
dogs-jump	87.8	85.6	92.4	96.6	78.1	77.8
drift-chicane	94.2	98.4	93.1	98.4	87.1	97.5
drift-straight	95.6	96.8	87.5	88.1	92.4	95.4
gold-fish	86.4	89.3	88.7	89.9	75.5	72.9
judo	83.0	85.4	78.6	82.0	76.5	80.6
loading	93.1	96.2	80.0	82.1	87.2	89.7
mbike-trick	87.3	89.8	81.7	83.7	80.3	77.6
pigs	87.9	86.8	82.4	81.4	83.0	81.6
Average	91.1	92.7	88.4	90.3	84.0	86.4
