High-Altitude Long-Endurance Solar-Powered Unmanned Aerial Vehicles (HALE-SUAVs) can significantly extend their endurance through well-designed trajectory planning, and Deep Reinforcement Learning (DRL) methods are well suited to this trajectory planning problem because of their real-time performance and adaptability. Addressing DRL-based trajectory planning for HALE-SUAVs, this paper establishes the kinematic and dynamic models of the UAV along with the energy-related models, designs an energy management strategy, constructs the overall DRL framework for the trajectory planning problem, and finally uses the trained model to conduct trajectory planning experiments under different solar radiation intensities. The results indicate that, with the DRL method proposed in this paper, a HALE-SUAV can select control commands appropriate to the current solar radiation intensity and thereby improve its endurance. These findings demonstrate the potential application value of DRL methods in HALE-SUAV trajectory planning.