A survey of few shot learning based on intelligent perception

  • SONG Chuang
  • ZHAO Jiajia
  • WANG Kang
  • LIANG Xinkai
  • 1. Science and Technology on Complex System Control and Intelligent Agent Cooperation Laboratory, Beijing 100074, China;
    2. School of Computer Science and Technology, Fudan University, Shanghai 200433, China

Received date: 2019-12-13

Revised date: 2019-12-20

Online published: 2019-12-26

Supported by: Defense Industrial Technology Development Program (JCKY2017204B064)

Abstract

Few-shot learning refers to training a machine learning model with only a small amount of supervised information about the target classes. Owing to its practical value, academia and industry have recently made significant advances in few-shot learning; in China, however, few reviews of the topic have been published. This paper systematically summarizes and explores few-shot learning algorithms and the object detection algorithms built upon them. First, the problem of few-shot learning is formally defined, its connections with other classic machine learning problems are enumerated, and its theoretical challenges are explained. Then, few-shot image classification is summarized and its representative works are analyzed. On this basis, the paper focuses on few-shot object detection, especially zero-shot object detection, and analyzes the existing research in detail. Finally, drawing on the strengths and weaknesses of existing methods, the future development of few-shot learning is discussed in terms of problem settings, theoretical research, implementation techniques, and application scenarios, in the hope of providing inspiration for subsequent research in this field.
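
To make the few-shot setting concrete, the following is a minimal sketch of the metric-learning family of classifiers surveyed in this paper, in the spirit of Prototypical Networks [61]: each class prototype is the mean embedding of its support examples, and queries are scored by their distance to the prototypes. The tensor shapes, the random features standing in for a trained feature extractor, and the function name prototypical_logits are illustrative assumptions, not the original authors' implementation.

```python
# Minimal sketch of metric-based few-shot classification in the style of
# Prototypical Networks [61]. Random tensors stand in for the output of a
# trained embedding network; all shapes and names here are illustrative.
import torch
import torch.nn.functional as F

def prototypical_logits(support, support_labels, query, n_way):
    """Score query embeddings by distance to per-class prototypes.

    support:        (n_way * k_shot, d) embedded support examples
    support_labels: (n_way * k_shot,)   integer class labels in [0, n_way)
    query:          (n_query, d)        embedded query examples
    """
    # Prototype of each class = mean embedding of its support examples.
    prototypes = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_way)]
    )  # (n_way, d)
    # Negative squared Euclidean distance serves as the classification logit.
    dists = torch.cdist(query, prototypes) ** 2  # (n_query, n_way)
    return -dists

# Toy 5-way 1-shot episode with random "embeddings".
n_way, k_shot, n_query, d = 5, 1, 15, 64
support = torch.randn(n_way * k_shot, d)
support_labels = torch.arange(n_way).repeat_interleave(k_shot)
query = torch.randn(n_query, d)

logits = prototypical_logits(support, support_labels, query, n_way)
loss = F.cross_entropy(logits, torch.randint(n_way, (n_query,)))  # episodic loss
print(logits.shape, loss.item())
```

In an actual episodic training loop, the support and query embeddings would be produced by a shared feature extractor, and the episodic loss would be backpropagated through it.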

Cite this article

SONG Chuang, ZHAO Jiajia, WANG Kang, LIANG Xinkai. A survey of few shot learning based on intelligent perception[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2020, 41(S1): 723756-723756. DOI: 10.7527/S1000-6893.2019.23756

References

[1] DENG J, DONG W, SOCHER R, et al. ImageNet:A large-scale hierarchical image database[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2009.
[2] KRIZHEVSKY A, SUTSKEVER I, HINTON G. ImageNet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems, 2012.
[3] HE K, ZHANG X, REN S. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016.
[4] LONG J, EVAN S, TREVOR D. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2015.
[5] REN S, HE K, GIRSHICK R, et al. Faster R-CNN:Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6):1137-1149.
[6] HOU X, ZHANG L. Saliency detection:A spectral residual approach[C]//IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2007.
[7] WANG Y, GIRSHICK R, HEBERT M, et al. Low-shot learning from imaginary data[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018.
[8] LAMPERT C H, NICKISCH H, HARMELING S. Attribute-based classification for zero-shot visual object categorization[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 36(3):453-465.
[9] SEBASTIAN T, PRATT L. Learning to learn[M]. Norwell:Springer Science & Business Media, 2012.
[10] SCHAAL S. Is imitation learning the route to humanoid robots[J]. Trends in Cognitive Sciences, 1999, 3(6):233.
[11] GARCIA V, BRUNA J. Few-shot learning with graph neural networks[EB/OL]. (2018-02-20)[2019-11-20]. https://arxiv.org/abs/1711.04043v1.
[12] DUAN Y, ANDRYCHOWICZ M, STADIE B, et al. One-shot imitation learning[C]//Advances in Neural Information Processing Systems, 2017:1087-1098.
[13] ORESHKIN B, LÓPEZ P R, LACOSTE A. TADAM:Task dependent adaptive metric for improved few-shot learning[C]//Advances in Neural Information Processing Systems, 2018:721-731.
[14] REN M, TRIANTAFILLOU E, RAVI S, et al. Meta-learning for semi-supervised few-shot classification[EB/OL]. (2018-03-02)[2019-11-20]. https://arxiv.org/abs/1803.00676.
[15] ROMERA-PAREDES B, TORR P. An embarrassingly simple approach to zero-shot learning[C]//International Conference on Machine Learning, 2015:2152-2161.
[16] CHANGPINYO S, CHAO W L, GONG B, et al. Synthesized classifiers for zero-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016:5327-5336.
[17] KODIROV E, XIANG T, FU Z, et al. Unsupervised domain adaptation for zero-shot learning[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2015:2452-2460.
[18] ZHANG Z, SALIGRAMA V. Zero-shot learning via joint latent similarity embedding[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016:6034-6042.
[19] PAN S J, YANG Q. A survey on transfer learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2009, 22(10):1345-1359.
[20] TURK M A, PENTLAND A P. Face recognition using eigenfaces[C]//Proceedings of 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 1991:586-591.
[21] KINGMA D P, MOHAMED S, REZENDE D J, et al. Semi-supervised learning with deep generative models[C]//Advances in Neural Information Processing Systems, 2014:3581-3589.
[22] BUCCINO G, VOGT S, RITZL A, et al. Neural circuits underlying imitation learning of hand actions:An event-related fMRI study[J]. Neuron, 2004, 42(2):323-334.
[23] SMEULDERS A W M, WORRING M, SANTINI S, et al. Content-based image retrieval at the end of the early years[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2000(12):1349-1380.
[24] BLACKMAN S. Multiple-target tracking with radar applications[M]. Dedham:Artech House, Inc., 1986.
[25] FREEMAN W T, ROTH M. Orientation histograms for hand gesture recognition[C]//International Workshop on Automatic Face and Gesture Recognition, 1995:296-301.
[26] XU K, BA J, KIROS R, et al. Show, attend and tell:Neural image caption generation with visual attention[C]//International Conference on Machine Learning, 2015:2048-2057.
[27] ANTOL S, AGRAWAL A, LU J, et al. VQA:Visual question answering[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2015:2425-2433.
[28] MEDIONI G, COHEN I, BRÉMOND F, et al. Event detection and analysis from video streams[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(8):873-889.
[29] BENGIO Y, DUCHARME R, VINCENT P, et al. A neural probabilistic language model[J]. Journal of Machine Learning Research, 2003, 3(2):1137-1155.
[30] ZOPH B, LE Q V. Neural architecture search with reinforcement learning[EB/OL]. (2016-11-05)[2019-11-20]. https://arxiv.org/abs/1611.01578.
[31] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017:2980-2988.
[32] WANG Y, YAO Q. Few-shot learning:A survey[EB/OL]. (2019-04-10)[2019-03-13]. https://arxiv.org/abs/1904.05046.
[33] RUSSAKOVSKY O, LI F F. Attribute learning in large-scale datasets[C]//European Conference on Computer Vision. Heidelberg:Springer, 2010:1-14.
[34] VILALTA R, DRISSI Y. A perspective view and survey of meta-learning[J]. Artificial Intelligence Review, 2002, 18(2):77-95.
[35] KODIROV E, XIANG T, FU Z, et al. Unsupervised domain adaptation for zero-shot learning[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2015:2452-2460.
[36] ZHU X, GOLDBERG A B. Introduction to semi-supervised learning[J]. Synthesis Lectures on Artificial Intelligence and Machine Learning, 2009, 3(1):1-130.
[37] BARLOW H B. Unsupervised learning[J]. Neural Computation, 1989, 1(3):295-311.
[38] ROSENBERG C, HEBERT M, SCHNEIDERMAN H. Semi-supervised self-training of object detection models[C]//2005 Seventh IEEE Workshops on Applications of Computer Vision. Piscataway:IEEE Press, 2005.
[39] ZHOU Z H. A brief introduction to weakly supervised learning[J]. National Science Review, 2018, 5(1):44-53.
[40] TORREY L, SHAVLIK J. Transfer learning[M]//Handbook of research on machine learning applications and trends:Algorithms, methods, and techniques. Hershey:IGI Global, 2009:242-264.
[41] BROWN A L, CAMPIONE J C, DAY J D. Learning to learn:On training students to learn from texts[J]. Educational Researcher, 1981, 10(2):14-21.
[42] HOCHREITER S, YOUNGER A S, CONWELL P R. Learning to learn using gradient descent[C]//International Conference on Artificial Neural Networks. Heidelberg:Springer, 2001:87-94.
[43] PALATUCCI M, POMERLEAU D, HINTON G E, et al. Zero-shot learning with semantic output codes[C]//Advances in Neural Information Processing Systems, 2009:1410-1418.
[44] ZHANG Z, SALIGRAMA V. Zero-shot learning via semantic similarity embedding[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2015:4166-4174.
[45] CHANGPINYO S, CHAO W L, GONG B, et al. Synthesized classifiers for zero-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016:5327-5336.
[46] KODIROV E, XIANG T, GONG S. Semantic autoencoder for zero-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2017:3174-3183.
[47] ZHANG T, JOHNSON D. A robust risk minimization based named entity recognition system[C]//Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003-Volume 4, 2003:204-207.
[48] CHEN W Y, LIU Y C, KIRA Z, et al. A closer look at few-shot classification[EB/OL]. (2019-04-08)[2019-01-12]. https://arxiv.org/abs/1904.04232.
[49] VAN DYK D A, MENG X L. The art of data augmentation[J]. Journal of Computational and Graphical Statistics, 2001, 10(1):1-50.
[50] HARIHARAN B, GIRSHICK R. Low-shot visual recognition by shrinking and hallucinating features[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017:3018-3027.
[51] CHEN Z, FU Y, ZHANG Y, et al. Multi-level semantic feature augmentation for one-shot learning[J]. IEEE Transactions on Image Processing, 2019, 28(9):4594-4605.
[52] HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786):504-507.
[53] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]//Advances in Neural Information Processing Systems, 2014:2672-2680.
[54] PERARNAU G, VAN DE WEIJER J, RADUCANU B, et al. Invertible conditional GANs for image editing[EB/OL].[2016-11-19]. https://arxiv.org/abs/1611.06355.
[55] CHU C, ZHMOGINOV A, SANDLER M. CycleGAN, a master of steganography[EB/OL]. (2017-12-08)[2017-12-16]. https://arxiv.org/abs/1712.02950.
[56] HOSSEINI-ASL E, ZHOU Y, XIONG C, et al. Augmented cyclic adversarial learning for low resource domain adaptation[EB/OL]. (2018-07-01)[2019-01-23]. https://arxiv.org/abs/1807.00374.
[57] JUANG B H, RABINER L R. Hidden Markov models for speech recognition[J]. Technometrics, 1991, 33(3):251-272.
[58] CHEN Z, FU Y, CHEN K, et al. Image block augmentation for one-shot learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2019:3379-3386.
[59] CHEN Z, FU Y, WANG Y X, et al. Image deformation meta-networks for one-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2019:8680-8689.
[60] VINYALS O, BLUNDELL C, LILLICRAP T, et al. Matching networks for one shot learning[C]//Advances in Neural Information Processing Systems, 2016:3630-3638.
[61] SNELL J, SWERSKY K, ZEMEL R. Prototypical networks for few-shot learning[C]//Advances in Neural Information Processing Systems, 2017:4077-4087.
[62] SUNG F, YANG Y, ZHANG L, et al. Learning to compare:Relation network for few-shot learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018:1199-1208.
[63] BERTINETTO L, HENRIQUES J F, TORR P H S, et al. Meta-learning with differentiable closed-form solvers[EB/OL]. (2018-05-21)[2019-12-20]. https://arxiv.org/abs/1805.08136.
[64] GARCIA V, BRUNA J. Few-shot learning with graph neural networks[EB/OL]. (2017-11-10)[2019-12-20]. https://arxiv.org/abs/1711.04043.
[65] HUANG Z, XU W, YU K. Bidirectional LSTM-CRF models for sequence tagging[EB/OL]. (2015-08-09)[2019-12-20]. https://arxiv.org/abs/1508.01991.
[66] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems, 2017:5998-6008.
[67] QI H, BROWN M, LOWE D G. Low-shot learning with imprinted weights[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018:5822-5830.
[68] GIDARIS S, KOMODAKIS N. Dynamic few-shot visual learning without forgetting[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2018:4367-4375.
[69] MOTIIAN S, PICCIRILLI M, ADJEROH D A, et al. Unified deep supervised domain adaptation and generalization[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017:5715-5725.
[70] MOTIIAN S, JONES Q, IRANMANESH S, et al. Few-shot adversarial domain adaptation[C]//Advances in Neural Information Processing Systems, 2017:6670-6680.
[71] FINN C, ABBEEL P, LEVINE S. Model-agnostic meta-learning for fast adaptation of deep networks[C]//Proceedings of the 34th International Conference on Machine Learning, 2017:1126-1135.
[72] RUSU A A, RAO D, SYGNOWSKI J, et al. Meta-learning with latent embedding optimization[EB/OL]. (2018-07-16)[2019-12-20]. https://arxiv.org/abs/1807.05960.
[73] RAVI S, LAROCHELLE H. Optimization as a model for few-shot learning[C]//International Conference on Learning Representations (ICLR), 2017.
[74] SANTORO A, BARTUNOV S, BOTVINICK M, et al. Meta-learning with memory-augmented neural networks[C]//International Conference on Machine Learning, 2016:1842-1850.
[75] HILLIARD N, PHILLIPS L, HOWLAND S, et al. Few-shot learning with metric-agnostic conditional embeddings[EB/OL]. (2018-02-12)[2019-12-20]. https://arxiv.org/abs/1802.04376.
[76] KINGMA D P, BA J. Adam:A method for stochastic optimization[EB/OL]. (2014-12-22)[2019-12-20]. https://arxiv.org/abs/1412.6980.
[77] DAI J, HE K, SUN J. Instance-aware semantic segmentation via multi-task network cascades[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016:3150-3158.
[78] WEI S E, RAMAKRISHNA V, KANADE T, et al. Convolutional pose machines[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016:4724-4732.
[79] ANDERSON P, HE X, BUEHLER C. Bottom-up and top-down attention for image captioning and visual question answering[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016:6077-6086.
[80] 刘芳, 王洪娟, 黄光伟, 等. 基于自适应深度网络的无人机目标跟踪算法[J]. 航空学报, 2019, 40(3):322332. LIU F, WANG H J, HUANG G W, et al. UAV target tracking algorithm based on adaptive depth network[J]. Acta Aeronautica et Astronautica Sinica, 2019, 40(3):322332(in Chinese).
[81] 张菁, 何友, 彭应宁, 等. 基于神经网络和人工势场的协同博弈路径规划[J]. 航空学报, 2019, 40(3):322493. ZHANG J, HE Y, PENG Y N, et al. Neural network and artificial potential field based cooperative and adversarial path planning[J]. Acta Aeronautica et Astronautica Sinica, 2019, 40(3):322493(in Chinese).
[82] 石叶楠, 郑国磊. 三种用于加工特征识别的神经网络方法综述[J]. 航空学报, 2019, 40(9):022840. SHI Y N, ZHENG G L. A review of three neural network methods for manufacturing feature recognition[J]. Acta Aeronautica et Astronautica Sinica, 2019, 40(9):022840(in Chinese).
[83] 王华夏, 程咏梅, 刘楠. 面向山地区域光照变化下的鲁棒景象匹配方法[J]. 航空学报, 2017, 38(10):321101. WANG H X, CHENG Y M, LIU N. A robust scene matching method for mountainous region with illumination variation[J]. Acta Aeronautica et Astronautica Sinica, 2017, 38(10):321101(in Chinese).
[84] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO:Common objects in context[C]//European Conference on Computer Vision, 2014:740-755.
[85] KUZNETSOVA A, ROM H, ALLDRIN N, et al. The Open Images Dataset V4:Unified image classification, object detection, and visual relationship detection at scale[EB/OL]. (2018-11-02)[2019-12-20]. https://arxiv.org/abs/1811.00982.
[86] REN S, HE K, GIRSHICK R, et al. Faster R-CNN:Towards real-time object detection with region proposal networks[C]//Advances in Neural Information Processing Systems, 2015:91-99.
[87] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once:Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2016:779-788.
[88] SUN Z, BEBIS G, MILLER R. On-road vehicle detection:A review[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2006(5):694-711.
[89] DOLLÁR P, WOJEK C, SCHIELE B, et al. Pedestrian detection:A benchmark[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE Press, 2009:304-311.
[90] SCHWARTZ E, KARLINSKY L, SHTOK J, et al. RepMet:Representative-based metric learning for classification and one-shot object detection[EB/OL]. (2018-06-12)[2019-12-20]. https://arxiv.org/abs/1806.04728.
[91] CHEN H, WANG Y, WANG G, et al. A low-shot transfer detector for object detection[C]//Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[92] ZHANG T, ZHANG Y, SUN X, et al. Comparison network for one-shot conditional object detection[EB/OL]. (2019-04-04)[2019-12-20]. https://arxiv.org/abs/1904.02317.
[93] HSIEH T I, LO Y C, CHEN H T, et al. One-shot object detection with co-attention and co-excitation[C]//Advances in Neural Information Processing Systems, 2019:2721-2730.
[94] FAN Q, ZHUO W, TAI Y W. Few-shot object detection with attention-RPN and multi-relation detector[EB/OL]. (2019-08-06)[2019-12-20]. https://arxiv.org/abs/1908.01998.
[95] RAHMAN S, KHAN S, PORIKLI F. Zero-shot object detection:Learning to simultaneously recognize and localize novel concepts[C]//Asian Conference on Computer Vision, 2018:547-563.
[96] ZHU P, WANG H, SALIGRAMA V. Zero shot detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 30(4):998-1010.
[97] BANSAL A, SIKKA K, SHARMA G, et al. Zero-shot object detection[C]//Proceedings of the European Conference on Computer Vision (ECCV), 2018:384-400.
[98] DEMIREL B, CINBIS R G, IKIZLER-CINBIS N. Zero-shot object detection by hybrid region embedding[EB/OL]. (2018-05-16)[2019-12-20]. https://arxiv.org/abs/1805.06157.
[99] RAHMAN S, KHAN S, BARNES N. Polarity loss for zero-shot object detection[EB/OL]. (2018-11-22)[2019-12-20]. https://arxiv.org/abs/1811.08982.