[1] CAELLES S, MANINIS K K, PONT-TUSET J, et al. One-shot video object segmentation[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 221-230.
[2] WU J, JIANG Y, BAI S, et al. Seqformer: Sequential transformer for video instance segmentation[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022: 553-569.
[3] CHENG H K, OH S W, PRICE B, et al. Putting the object back into video object segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 3151-3161.
[4] WANG X, WANG W, CAO Y, et al. Images speak in images: A generalist painter for in-context visual learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 6830-6839.
[5] WANG X, ZHANG X, CAO Y, et al. Seggpt: Segmenting everything in context[J]. arXiv preprint arXiv:2304.03284, 2023.
[6] JABRI A, OWENS A, EFROS A. Space-time correspondence as a contrastive random walk[J]. Advances in neural information processing systems, 2020, 33: 19545-19560.
[7] CARON M, TOUVRON H, MISRA I, et al. Emerging properties in self-supervised vision transformers[C]//Proceedings of the IEEE/CVF international conference on computer vision. 2021: 9650-9660.
[8] KIRILLOV A, MINTUN E, RAVI N, et al. Segment anything[C]//Proceedings of the IEEE/CVF international conference on computer vision. 2023: 4015-4026.
[9] RAVI N, GABEUR V, HU Y T, et al. Sam 2: Segment anything in images and videos[J]. arXiv preprint arXiv:2408.00714, 2024.
[10] YANG J, GAO M, LI Z, et al. Track anything: Segment anything meets videos[J]. arXiv preprint arXiv:2304.11968, 2023.
[11] CHENG Y, LI L, XU Y, et al. Segment and track anything[J]. arXiv preprint arXiv:2305.06558, 2023.
[12] CHENG H K, SCHWING A G. Xmem: Long-term video object segmentation with an atkinson-shiffrin memory model[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022: 640-658.
[13] YANG Z, YANG Y. Decoupling features in hierarchical propagation for video object segmentation[J]. Advances in Neural Information Processing Systems, 2022, 35: 36324-36336.
[14] ZHONG S, LI G, YING W, et al. Efficient Semi-Supervised Object Segmentation for Long-Term Videos Using Adaptive Memory Network[J]. IEEE Transactions on Cognitive and Developmental Systems, 2024.
[15] RAJIČ F, KE L, TAI Y W, et al. Segment anything meets point tracking[C]//2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, 2025: 9302-9311.
[16] HARLEY A W, FANG Z, FRAGKIADAKI K. Particle video revisited: Tracking through occlusions using point trajectories[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022: 59-75.
[17] DETONE D, MALISIEWICZ T, RABINOVICH A. Superpoint: Self-supervised interest point detection and description[C]//Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2018: 224-236.
[18] SARLIN P E, DETONE D, MALISIEWICZ T, et al. Superglue: Learning feature matching with graph neural networks[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 4938-4947.
[19] SARLIN P E, CADENA C, SIEGWART R, et al. From coarse to fine: Robust hierarchical localization at large scale[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019: 12716-12725.
[20] FISCHLER M A, BOLLES R C. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography[J]. Communications of the ACM, 1981, 24(6): 381-395.
[21] QUIGLEY M, CONLEY K, GERKEY B, et al. ROS: an open-source Robot Operating System[C]//ICRA workshop on open source software. 2009, 3(3.2): 5.
[22] ESTER M, KRIEGEL H P, SANDER J, et al. Density-based spatial clustering of applications with noise[C]//Int. Conf. knowledge discovery and data mining. 1996, 240(6).