航空学报 (Acta Aeronautica et Astronautica Sinica), 2025, Vol. 46, Issue 12: 331420. doi: 10.7527/S1000-6893.2024.31420

基于深度强化学习的太阳能无人机航迹规划

余子杰1, 郑征1, 李清东1, 郭林2, 任素萍2, 郭健3

  1. 北京航空航天大学 自动化科学与电气工程学院,北京 100191
  2. 中国航天空气动力技术研究院,北京 100074
  3. 中国煤炭科工集团有限公司,北京 100028
  • 收稿日期:2024-10-21 修回日期:2024-11-08 接受日期:2024-12-10 出版日期:2025-01-07 发布日期:2024-12-30
  • 通讯作者: 郑征 E-mail:zhengz@buaa.edu.cn
  • 基金资助:
    国家自然科学基金(62372021)

Trajectory planning for solar-powered UAVs based on deep reinforcement learning

Zijie YU1, Zheng ZHENG1, Qingdong LI1, Lin GUO2, Suping REN2, Jian GUO3

  1. School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
  2. China Aerospace Aerodynamics Research Institute, Beijing 100074, China
  3. China Coal Science and Engineering Group Corporation, Beijing 100028, China
  • Received: 2024-10-21 Revised: 2024-11-08 Accepted: 2024-12-10 Online: 2025-01-07 Published: 2024-12-30
  • Contact: Zheng ZHENG, E-mail: zhengz@buaa.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62372021)

摘要:

高空长航时太阳能无人机(HALE-SUAV)通过合理的航迹规划可以极大提升其续航性能,而深度强化学习方法由于实时性与自适应性成为该航迹规划问题的理想选择。针对基于深度强化学习方法的HALE-SUAV航迹规划问题,建立了无人机的运动学与动力学模型以及能量相关模型,设计了其能量管理策略,搭建了该航迹规划问题的深度强化学习整体框架,并最终使用训练出来的模型进行了不同太阳能辐射强度情况下的航迹规划实验。研究结果表示基于所提的深度强化学习方法,HALE-SUAV能够选择基于当前太阳能辐射强度情况下合理的控制指令,以提高其续航性能。研究结果显示了深度强化学习方法在HALE-SUAV航迹规划问题的潜在应用价值。

关键词: 深度强化学习, 高空长航时太阳能无人机, 航迹规划, 续航性能, 能量管理策略

Abstract:

High-altitude long-endurance solar-powered unmanned aerial vehicles (HALE-SUAVs) can significantly extend their endurance through well-designed trajectory planning. Deep reinforcement learning (DRL) methods are an ideal choice for this trajectory planning problem owing to their real-time performance and adaptability. To address DRL-based HALE-SUAV trajectory planning, this paper establishes the kinematic and dynamic models of the UAV together with its energy-related models, designs an energy management strategy, constructs the overall DRL framework for the problem, and uses the trained model to conduct trajectory planning experiments under different solar radiation intensities. The results indicate that, with the proposed DRL method, the HALE-SUAV can select reasonable control commands for the current solar radiation intensity and thereby improve its endurance. These findings demonstrate the potential application value of DRL methods for HALE-SUAV trajectory planning.

Key words: deep reinforcement learning, HALE-SUAV, trajectory planning, endurance performance, energy management strategy

中图分类号 (CLC number):