Autonomous Dispatch Method for Carrier-Based Aircraft Based on Locally Guided Reinforcement Learning

  • Wang Zheng (王政),
  • Wang Hua (王华),
  • Cui Keke (崔可可),
  • Li Chaochao (李超超),
  • Liu Junnan (刘俊楠),
  • Xu Mingliang (徐明亮)
  • 1. Zhengzhou University
    2. School of Computer and Artificial Intelligence, Zhengzhou University

Received: 2024-10-08

Revised: 2025-03-09

Published online: 2025-03-12

Funding

National Key Research and Development Program of China; National Natural Science Foundation of China

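Abstract

Limited deck space and a dynamic, changing environment make autonomous dispatch of carrier-based aircraft challenging. Existing reinforcement-learning-based automatic parking techniques offer a new technical approach to the problem, but when applied directly to carrier-based aircraft, which must be dispatched in a dynamic environment under attitude constraints, these methods fail to converge. To address this, this paper proposes a locally guided reinforcement learning method for autonomous dispatch of carrier-based aircraft. The method guides the learning process with two rewards: a local target-state reward based on a pre-planned trajectory, and a local state-grid reward that applies near the dispatch endpoint. This guidance avoids local optima and convergence failure during training and thereby significantly improves the dispatch success rate. Experimental results show that the proposed method outperforms traditional autonomous dispatch methods in both success rate and safety, and it has been validated across multiple task scenarios and with different numbers of carrier-based aircraft.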
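The two guidance rewards named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration and not the authors' implementation: the function names `waypoint_reward` and `goal_grid_reward`, the `(x, y, heading)` state layout, the tolerance and radius values, and the particular reward shapes. The first function pays a bonus when the aircraft reaches the next waypoint of a pre-planned trajectory. The second pays a larger reward for grid cells closer to the endpoint and for headings closer to the goal heading.

```python
import math


def waypoint_reward(state, waypoints, next_idx, tol=1.0, bonus=1.0):
    """Hypothetical local target-state reward along a pre-planned trajectory.

    Returns (reward, updated_index). A bonus is paid once the aircraft is
    within `tol` of the next waypoint, and the index then advances.
    """
    if next_idx >= len(waypoints):
        return 0.0, next_idx  # trajectory already completed
    x, y, _heading = state
    wx, wy = waypoints[next_idx]
    if math.hypot(x - wx, y - wy) <= tol:
        return bonus, next_idx + 1
    return 0.0, next_idx


def goal_grid_reward(state, goal, radius=5.0, cell=1.0, max_reward=1.0):
    """Hypothetical local state-grid reward near the dispatch endpoint.

    The square region of half-width `radius` around the goal is split into
    square rings of width `cell`; inner rings and smaller heading errors
    earn more. Outside the region the reward is zero.
    """
    x, y, heading = state
    gx, gy, goal_heading = goal
    offset = max(abs(x - gx), abs(y - gy))  # Chebyshev distance to goal
    if offset > radius:
        return 0.0
    ring = int(offset // cell)
    n_rings = int(radius // cell) + 1
    position_term = 1.0 - ring / n_rings
    # Wrap the heading error into [0, pi].
    err = abs((heading - goal_heading + math.pi) % (2 * math.pi) - math.pi)
    heading_term = 1.0 - err / math.pi
    return max_reward * position_term * heading_term
```

In a training loop, these terms would typically be added to the environment's base reward at each step. How the terms are weighted, and how they interact with collision penalties, is not specified in the abstract.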

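Cite this article

Wang Zheng, Wang Hua, Cui Keke, Li Chaochao, Liu Junnan, Xu Mingliang. Autonomous dispatch method for carrier-based aircraft based on locally guided reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica (航空学报), 0: 1-0. DOI: 10.7527/S1000-6893.2024.31333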
