High-Altitude Long-Endurance Solar-Powered Unmanned Aerial Vehicles (HALE-SUAVs) can significantly extend their endurance through well-designed trajectory planning, and Deep Reinforcement Learning (DRL) methods are well suited to this trajectory planning problem because of their real-time performance and adaptability. Addressing DRL-based trajectory planning for HALE-SUAVs, this paper establishes the kinematic and dynamic models of the UAV along with the energy-related models, designs an energy management strategy, constructs the overall DRL framework for the trajectory planning problem, and finally uses the trained model to conduct trajectory planning experiments under different solar radiation intensities. The results indicate that, with the DRL method proposed in this paper, a HALE-SUAV can select control commands appropriate to the current solar radiation intensity and thereby improve its endurance. These findings demonstrate the potential application value of DRL methods in HALE-SUAV trajectory planning.