Electronics and Electrical Engineering and Control

Improved YOLOX object detection algorithm based on gradient difference adaptive learning rate optimization

  • Yucun SONG,
  • Quanbo GE,
  • Junlong ZHU,
  • Zhenyu LU
  • 1. School of Artificial Intelligence/School of Future Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China
    2. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
    3. Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science and Technology, Nanjing 210044, China
    4. Jiangsu Province Engineering Research Center of Intelligent Meteorological Exploration Robot (C-IMER), Nanjing University of Information Science and Technology, Nanjing 210044, China
    5. College of Information Engineering, Henan University of Science and Technology, Luoyang 471000, China
E-mail: qbge_tju@163.com

Received date: 2022-08-29

Revised date: 2022-10-18

Accepted date: 2022-11-09

Online published: 2022-11-17

Supported by

National Natural Science Foundation of China (62033010); Aeronautical Science Foundation of China (2019460T5001)

Abstract

Object detection has long been one of the most challenging problems in computer vision and is widely applied in tasks such as face recognition, autonomous driving, and traffic detection. To further improve the performance of current mainstream object detection algorithms, this paper proposes an improved object detection algorithm based on YOLOX and evaluates it on the standard PASCAL VOC 07+12 and RSOD datasets. The YOLOX algorithm is improved mainly through data augmentation and refinements to the network structure and loss function. In addition, an adaptive learning rate optimization algorithm based on the gradient difference is proposed to train the improved YOLOX; the optimizer is also applicable to training other neural networks. On the PASCAL VOC 07+12 dataset, the AP of the improved YOLOX-S algorithm increases from 61.63% to 66.35% relative to the original YOLOX-S, a clear improvement. On the RSOD dataset, the AP of the improved YOLOX-S increases from 69.4% to 73.2%, and the gain over other mainstream YOLO-series algorithms is also significant. The experiments demonstrate that the proposed method effectively improves YOLOX's object detection performance.
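
The abstract describes the gradient-difference idea only at a high level, so the sketch below is a hedged illustration rather than the authors' implementation: an Adam-style update whose effective learning rate is additionally scaled by a sigmoid of the absolute difference between the current and previous gradients, in the spirit of diffGrad-type optimizers. The function name grad_diff_adam_step, the exact friction term, and the hyperparameters are assumptions made purely for illustration.

```python
import numpy as np

def grad_diff_adam_step(param, grad, state, lr=1e-3,
                        beta1=0.9, beta2=0.999, eps=1e-8):
    """One parameter update of an Adam-style optimizer whose step size is
    additionally modulated by the difference between successive gradients.
    (Illustrative sketch only; not the paper's exact update rule.)"""
    t = state["t"] + 1

    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * state["m"] + (1.0 - beta1) * grad
    v = beta2 * state["v"] + (1.0 - beta2) * grad * grad

    # Bias-corrected estimates, as in Adam.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)

    # Gradient-difference "friction" term in [0.5, 1): when consecutive
    # gradients barely change (e.g. near an optimum) the coefficient stays
    # near 0.5 and the effective learning rate is damped; when the gradient
    # changes sharply the coefficient approaches 1 and the update is close
    # to a full Adam step.
    xi = 1.0 / (1.0 + np.exp(-np.abs(grad - state["prev_grad"])))

    new_param = param - lr * xi * m_hat / (np.sqrt(v_hat) + eps)

    state.update(m=m, v=v, prev_grad=grad.copy(), t=t)
    return new_param, state


# Example: a few steps on a toy quadratic loss f(w) = ||w||^2 / 2,
# whose gradient is simply w.
w = np.array([1.0, -2.0])
state = {"m": np.zeros_like(w), "v": np.zeros_like(w),
         "prev_grad": np.zeros_like(w), "t": 0}
for _ in range(100):
    w, state = grad_diff_adam_step(w, w, state, lr=0.05)
print(w)  # should end up close to the optimum [0, 0]
```

In practice such a rule would be wrapped as a torch.optim.Optimizer subclass when training YOLOX, but the NumPy form above keeps the update itself explicit.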

Cite this article

Yucun SONG, Quanbo GE, Junlong ZHU, Zhenyu LU. Improved YOLOX object detection algorithm based on gradient difference adaptive learning rate optimization[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(14): 327951. DOI: 10.7527/S1000-6893.2022.27951
