[1] 朱华勇, 牛轶峰, 沈林成, 等. 无人机系统自主控制技术研究现状与发展趋势[J].国防科技大学学报,2010,32(3):115-120. ZHU H Y, NIU Y F, SHEN L C, et al. State of the art and trends of autonomous control of UAV systems[J]. Journal of National University of Defense Technology, 2010,32(3):115-120(in Chinese). [2] 宋闯, 赵佳佳, 王康, 等. 面向智能感知的小样本学习研究综述[J].航空学报,2020,41(S2):723756. SONG C, ZHAO J J, WANG K, et al. Few shot learning based intelligent perception:A survey[J].Acta Aeronautica et Astronautica Sinica, 2020,41(S2):723756(in Chinese). [3] 李诚龙, 屈文秋, 李彦冬, 等. 面向eVTOL航空器的城市空中运输交通管理综述[J].交通运输工程学报,2020,20(4):35-54. LI C L, QU W Q, LI Y D, et al. Overview on traffic management of urban air mobility(UAM) with eVTOL aircraft[J]. Journal of Traffic and Transportation Engineering,2020,20(4):35-54(in Chinese). [4] 石叶楠, 郑国磊. 三种用于加工特征识别的神经网络方法综述[J].航空学报,2019,40(9):182-198. SHI Y N, ZHENG G L. A review of three neural network methods for manufacturing feature recognition[J]. Acta Aeronautica et Astronautica Sinica, 2019,40(9):182-198(in Chinese). [5] 李彦冬, 郝宗波, 雷航. 卷积神经网络研究综述[J].计算机应用,2016,36(9):2508-2515,2565. LI Y D, HAO Z B, LEI H. Survey of convolutional neural network[J]. Journal of Computer Applications, 2016,36(9):2508-2515,2565(in Chinese). [6] LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2):91-110. [7] DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]//CVPR 2005:Proceedings of the 2005 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2005:886-893. [8] LECUN Y, BOTTOU L. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11):2278-2324. [9] KRIZHEVSKY A, SUTSKEVER I, HINTON G. ImageNet classification with deep convolutional neural networks[J]. Advances in Neural Information Processing Systems, 2012, 25(2):1097-1105. [10] DENG J, DONG W, SOCHER R, et al. Imagenet:A large-scale hierarchical image database[C]//CVPR 2009:Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2009:248-255. [11] LIN M, CHEN Q, YAN S. Network in network[DB/OL]. ArXiv Preprint:1312.4400, 2013. [12] ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks[C]//ECCV 2014:2014 European Conference on Computer Vision. Berlin:Springer, 2014:818-833. [13] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[DB/OL]. ArXiv Preprint:1409.1556, 2014. [14] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[C]//CVPR 2015:Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2015:1-9. [15] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//CVPR 2016:Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2016:770-778. [16] HUANG G, LIU Z, DER MAATEN L V, et al. Densely connected convolutional networks[C]//CVPR 2017:Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2017:2261-2269. [17] HU J, SHEN L, ALBANIE S, et al. Squeeze-and-excitation networks[C]//CVPR 2018:Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2018:7132-7141. [18] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//CVPR 2014:Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2014:580-587. [19] HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9):1904-1916. [20] GIRSHICK R. Fast R-CNN[C]//CVPR 2015:Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2015:1440-1448. [21] REN S, HE K, GIRSHICK R, et al. Faster R-CNN:Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6):1137-1149. [22] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once:Unified, real-time object detection[C]//CVPR 2016:Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2016:779-788. [23] LIU W, ANGUELOV D, ERHAN D, et al. SSD:Single shot multibox detector[C]//ECCV 2016:2016 European Conference on Computer Vision. Berlin:Springer, 2016:21-37. [24] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft coco:Common objects in context[C]//ECCV 2014:2014 European Conference on Computer Vision. Berlin:Springer, 2014:740-755. [25] EVERINGHAM M, ESLAMI S M, VAN GOOL L, et al. The pascal visual object classes challenge:A retrospective[J]. International Journal of Computer Vision, 2015, 111(1):98-136. [26] YANG Y, NEWSAM S. Bag-of-visual-words and spatial extensions for land-use classification[C]//Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York:Association for Computing Machinery, 2010:270-279. [27] CHENG G, ZHOU P, HAN J, et al. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(12):7405-7415. [28] RAZAKARIVONY S, JURIE F. Vehicle detection in aerial imagery[J]. Journal of Visual Communication and Image Representation, 2016,34(C):187-203. [29] ZHU H, CHEN X, DAI W, et al. Orientation robust object detection in aerial images using deep convolutional neural network[C]//2015 IEEE International Conference on Image Processing. Piscataway, NJ:IEEE Press, 2015:3735-3739. [30] XIA G S, BAI X, DING J, et al. DOTA:A large-scale dataset for object detection in aerial images[C]//CVPR 2018:Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2018:3974-3983. [31] ROBICQUET A, SADEGHIAN A, ALAHI A, et al. Learning social etiquette:Human trajectory understanding in crowded scenes[C]//ECCV 2016:2016 European Conference on Computer Vision. Berlin:Springer, 2016:549-565. [32] BAREKATAIN M, MARTI M, SHIH H, et al. Okutama-Action:an aerial view video dataset for concurrent human action detection[C]//CVPR 2017:Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2017:2153-2160. [33] HSIEH M, LIN Y, HSU W H, et al. Drone-based object counting by spatially regularized regional proposal network[C]//ICCV 2017:Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE, 2017:4165-4173. [34] ZHU P, WEN L, BIAN X, et al. Vision meets drones:a challenge[DB/OL]. ArXiv Preprint:1804.07437, 2018. [35] ZHU P, SUN Y, WEN L, et al. Drone based RGBT vehicle detection and counting:a challenge[DB/OL]. ArXiv Preprint:2003.02437, 2020. [36] TORRALBA A, EFROS A A. Unbiased look at dataset bias[C]//CVPR 2011:Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2011:1521-1528. [37] PAN S J, YANG Q. A survey on transfer learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10):1345-1359. [38] PAN B, TAI J, ZHENG Q, et al. Cascade convolutional neural network based on transfer-learning for aircraft detection on high-resolution remote sensing images[J]. Journal of Sensors, 2017:1-14. [39] 袁功霖, 侯静, 尹奎英. 基于迁移学习与图像增强的夜间航拍车辆识别方法[J].计算机辅助设计与图形学学报,2019,31(3):467-473. YUAN G L, HOU J, YIN K Y. Night-time aerial image vehicle recognition technology based on transfer learning and image enhancement[J]. Journal of Computer-Aided Design & Computer Graphics, 2019,31(3):467-473(in Chinese). [40] 王泽隆, 徐向辉, 张雷. 基于仿真SAR图像深度迁移学习的自动目标识别[J].中国科学院大学学报,2020,37(4):516-524. WANG Z L, XU X H, ZHANG L. Study of deep transfer learning for SAR ATR based on simulated SAR images[J]. Journal of University of Chinese Academy of Sciences, 2020,37(4):516-524(in Chinese). [41] ZAMIR A R, SAX A, SHEN W B, et al. Taskonomy:disentangling task transfer learning[C]//CVPR 2018:Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2018:3712-3722. [42] YOSINSKI J, CLUNE J, BENGIO Y, et al. How transferable are features in deep neural networks?[C]//International Conference on Neural Information Processing Systems, Cambridge:MIT Press, 2014:3320-3328. [43] AUDEBERT N, SAUX B L, LEFEVRE S, et al. Segment-before-detect:vehicle detection and classification through semantic segmentation of aerial images[J]. Remote Sensing, 2017, 9(4):368-386. [44] HE K, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2):386-397. [45] CHEN L C, HERMANS A, PAPANDREOU G, et al. MaskLab:instance segmentation by refining object detection with semantic and direction features[C]//CVPR 2018:Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2018:4013-4022. [46] CHEN K, PANG J, WANG J, et al. Hybrid task cascade for instance segmentation[C]//CVPR 2019:Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2019:4974-4983. [47] LI C, XU C, CUI Z, et al. Learning object-wise semantic representation for detection in remote sensing imagery[C]//CVPR 2019:Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2019:20-27. [48] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[DB/OL]. ArXiv Preprint:1706.05587, 2017. [49] 张瑞倩, 邵振峰, ALEKSEI P, 等. 多尺度空洞卷积的无人机影像目标检测方法[J].武汉大学学报(信息科学版),2020,45(6):895-903. ZHANG R Q, SHAO Z F, ALEKSEI PORTNOV, et al. Multi? scale dilated convolutional neural network for object detection in UAV images[J]. Geomatics and Information Science of Wuhan University, 2020,45(6):895-903(in Chinese). [50] YANG X, YANG J, YAN J, et al. SCRDet:towards more robust detection for small, cluttered and rotated objects[C]//ICCV 2018:Proceedings of the 2018 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2019:8232-8241. [51] SEVO I, AVRAMOVIC A. Convolutional neural network based automatic object detection on aerial images[J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(5):740-744. [52] SOMMER L W, SCHUCHERT T, BEYERER J. Fast deep vehicle detection in aerial images[C]//WACV 2017:2017 IEEE Winter Conference on Applications of Computer Vision. Washington, D.C.:IEEE Computer Society, 2017:311-319. [53] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//CVPR 2017:Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2017:2117-2125. [54] AZIMI S M, VIG E, BAHMANYAR R, et al. Towards multi-class object detection in unconstrained remote sensing imagery[C]//Asian Conference on Computer Vision. Berlin:Springer, 2018:150-165. [55] DAI J, QI H, XIONG Y, et al. Deformable convolutional networks[C]//ICCV 2017:Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2017:764-773. [56] YANG X, SUN H, FU K, et al. Automatic ship detection in remote sensing images from google earth of complex scenes based on multiscale rotation dense feature pyramid networks[J]. Remote Sensing, 2018, 10(1):132-146. [57] WANG J, DING J, GUO H, et al. Mask OBB:A semantic attention-based mask oriented bounding box representation for multi-category object detection in aerial images[J]. Remote Sensing, 2019, 11(24):2930-2951. [58] 刘芳, 吴志威, 杨安喆, 等. 基于多尺度特征融合的自适应无人机目标检测[J].光学学报,2020,40(10):133-142. LIU F, WU Z W, YANG A Z, et al. Multi-scale feature fusion based adaptive object detection for UAV[J]. Acta Optica Sinaica, 2020,40(10):133-142(in Chinese). [59] HE K, GIRSHICK R, DOLLáR P. Rethinking imagenet pre-training[C]//ICCV 2019:Proceedings of the 2019 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2019:4918-4927. [60] ZHU R, ZHANG S, WANG X, et al. ScratchDet:Rraining single-shot object detectors from scratch[C]//CVPR 2019:Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2019:2268-2277. [61] WANG T, ANWER R M, CHOLAKKAL H, et al. Learning rich features at high-speed for single-shot object detection[C]//ICCV 2019:Proceedings of the 2019 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2019:1971-1980. [62] YU X, GONG Y, JIANG N, et al. Scale match for tiny person detection[C]//WACV 2020:2020 IEEE Winter Conference on Applications of Computer Vision. Washington, D.C.:IEEE Computer Society, 2020:1257-1265. [63] 刘颖, 刘红燕, 范九伦, 等. 基于深度学习的小目标检测研究与应用综述[J].电子学报,2020,48(3):590-601. LIU Y, LIU H Y, FAN J L, et al. A survey of research and application of small object detection based on deep learning[J]. Acta Electronica Sinica, 2020,48(3):590-601(in Chinese). [64] LALONDE R, ZHANG D, SHAH M. ClusterNet:detecting small objects in large scenes by exploiting spatio-temporal information[C]//CVPR 2018:Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2018:4003-4012. [65] YANG F, FAN H, CHU P, et al. Clustered object detection in aerial images[C]//ICCV 2019:Proceedings of the 2019 IEEE International Conference on Computer Vision. Piscataway:IEEE Press, 2019:8311-8320. [66] GAO M, YU R, LI A, et al. Dynamic zoom-in network for fast object detection in large images[C]//CVPR 2018:Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2018:6926-6935. [67] UZKENT B, YEH C, ERMON S. Efficient object detection in large images using deep reinforcement learning[C]//WACV 2020:2020 IEEE Winter Conference on Applications of Computer Vision. Washington, D.C.:IEEE Computer Society, 2020:1824-1833. [68] BOČICSTULIC D, MARUSIC Č, GOTOVAC S, et al. Deep learning approach in aerial imagery for supporting land search and rescue missions[J]. International Journal of Computer Vision, 2019, 127(9):1256-1278. [69] JIANG Y, ZHU X, WANG X, et al. R2CNN:Rotational region CNN for orientation robust scene text detection[DB/OL]. ArXiv Preprint:1706.09579, 2017. [70] XU Y, FU M, WANG Q, et al. Gliding vertex on the horizontal bounding box for multi-oriented object detection[DB/OL]. ArXiv Preprint:1911.09358v2, 2020. [71] MA J, SHAO W, YE H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. IEEE Transactions on Multimedia, 2018, 20(11):3111-3122. [72] DING J, XUE N, LONG Y, et al. Learning ROI transformer for oriented object detection in aerial images[C]//CVPR 2019:Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2019:2849-2858. [73] ZHOU X, WANG D, KRAHENBUHL P, et al. Objects as points[DB/OL]. ArXiv Preprint:1904.07850, 2019. [74] PAN X, REN Y, SHENG K, et al. Dynamic refinement network for oriented and densely packed object detection[C]//CVPR 2020:Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2020:11207-11216. [75] CHENG G, HAN J, ZHOU P, et al. Learning rotation-invariant and fisher discriminative convolutional neural networks for object detection[J]. IEEE Transactions on Image Processing, 2019, 28(1):265-278. [76] YANG M, YU K, ZHANG C, et al. DenseASPP for semantic segmentation in street scenes[C]//CVPR 2018:Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2018:3684-3692. [77] GUO C, FAN B, ZHANG Q, et al. AugFPN:Improving multi-scale feature learning for object detection.[C]//CVPR 2020:Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2020:12595-12604. [78] DONG Z, LI G, LIAO Y, et al. CentripetalNet:Pursuing high-quality keypoint pairs for object detection.[C]//CVPR 2020:Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C.:IEEE Computer Society, 2020:10519-10528. |