ACTA AERONAUTICA ET ASTRONAUTICA SINICA
Improved YOLOX object detection algorithm based on gradient difference adaptive learning rate optimization
Received date: 2022-08-29
Revised date: 2022-10-18
Accepted date: 2022-11-09
Online published: 2022-11-17
Supported by: National Natural Science Foundation of China (62033010); Aeronautical Science Foundation of China (2019460T5001)
Object detection is one of the most challenging problems in computer vision and is widely used in tasks such as face recognition, autonomous driving, and traffic detection. To further improve the performance of current mainstream object detection algorithms, this paper proposes an improved object detection algorithm based on YOLOX and evaluates it on the standard PASCAL VOC 07+12 and RSOD datasets. YOLOX is improved in three respects: data augmentation, network structure, and loss function. In addition, an adaptive learning rate optimization algorithm based on gradient difference is proposed to train the improved YOLOX; it is also suitable for optimizing other neural networks. On PASCAL VOC 07+12, the AP of the improved YOLOX-S rises from 61.63% for the original YOLOX-S to 66.35%, a gain of 4.72 percentage points. On RSOD, the AP of the improved YOLOX-S rises from 69.4% to 73.2%, a gain of 3.8 percentage points, and the results are also compared with those of other mainstream YOLO-series algorithms. These experiments show that the proposed changes effectively improve YOLOX's object detection performance.
Key words: object detection; YOLOX; neural network optimization; PASCAL VOC; RSOD
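The abstract does not spell out the update rule of the proposed optimizer. Purely as an illustration of what a gradient-difference adaptive learning rate can look like, the sketch below implements the published diffGrad rule (Dubey et al., 2020), which scales an Adam-style step by a sigmoid of the absolute change between consecutive gradients. It is not the algorithm proposed in this paper. The function names, hyperparameter values, and the toy quadratic objective are assumptions chosen for the example.

```python
import math


def friction(prev_grad, grad):
    """diffGrad friction coefficient: sigmoid of the absolute gradient change.

    Lies in [0.5, 1): close to 0.5 when consecutive gradients barely differ
    (flat region, smaller steps) and close to 1 when they differ a lot.
    """
    return 1.0 / (1.0 + math.exp(-abs(prev_grad - grad)))


def diffgrad_minimize(grad_fn, theta, lr=0.1, beta1=0.9, beta2=0.999,
                      eps=1e-8, steps=500):
    """Minimize a function given its gradient, using a diffGrad-style update.

    grad_fn maps a list of floats to a list of partial derivatives.
    All hyperparameter defaults are illustrative assumptions.
    """
    theta = list(theta)
    m = [0.0] * len(theta)        # first-moment (mean) estimate
    v = [0.0] * len(theta)        # second-moment (uncentered variance) estimate
    prev_g = [0.0] * len(theta)   # gradient from the previous step
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        for i in range(len(theta)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i]
            m_hat = m[i] / (1.0 - beta1 ** t)   # bias correction, as in Adam
            v_hat = v[i] / (1.0 - beta2 ** t)
            xi = friction(prev_g[i], g[i])      # gradient-difference scaling
            theta[i] -= lr * xi * m_hat / (math.sqrt(v_hat) + eps)
        prev_g = g
    return theta


def quad_grad(p):
    """Gradient of the toy objective f(x, y) = (x - 3)^2 + (y + 1)^2."""
    return [2.0 * (p[0] - 3.0), 2.0 * (p[1] + 1.0)]


if __name__ == "__main__":
    # The minimizer of the toy objective is (3, -1).
    print(diffgrad_minimize(quad_grad, [0.0, 0.0]))
```

The only difference from a plain Adam step is the factor `xi`: when the gradient changes little between iterations the step is roughly halved, which damps oscillation near an optimum while leaving steps almost unchanged where the gradient is still changing quickly.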
Yucun SONG, Quanbo GE, Junlong ZHU, Zhenyu LU. Improved YOLOX object detection algorithm based on gradient difference adaptive learning rate optimization[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2023, 44(14): 327951-327951. DOI: 10.7527/S1000-6893.2022.27951