Integrated guidance and control method based on deep reinforcement learning parameter tuning
Received date: 2025-06-03
Revised date: 2025-06-04
Accepted date: 2025-06-05
Online published: 2025-06-20
Supported by
Key Program of the Joint Fund for Basic Research on Large Aircraft of the National Natural Science Foundation of China (U2570207); National Science Fund for Excellent Young Scholars (62222317); Hunan Provincial Key Technology Innovation Program (2021GK1030); Key Research and Development Program of Hunan Province (2023GK2023)
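XIE Q C, CAO C Y, ZHAO Y Y, LI F B. Integrated guidance and control method based on deep reinforcement learning parameter tuning[J]. Acta Aeronautica et Astronautica Sinica, 2025, 46(24): 632345 (in Chinese). DOI: 10.7527/S1000-6893.2025.32345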
To address the dynamic optimization of guidance and control parameters for hypersonic flight vehicles, a deep reinforcement learning parameter tuning method based on the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm is proposed. First, the motion model of the hypersonic flight vehicle and the integrated guidance and control model are established, a backstepping controller is designed, and uniform ultimate boundedness is proved via Lyapunov stability theory. Then, the controller parameter optimization problem is formulated as a Markov decision process, and data-driven online adaptive optimization of the controller parameters is achieved with the TD3 algorithm. The resulting parameter optimization mechanism combines model-based prior knowledge with data-driven learning, which significantly enhances the controller's autonomous adaptability in the parameter space. Finally, the effectiveness and robustness of the proposed method are verified through numerical simulation.
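The abstract does not give the state, action, or reward definitions of the Markov decision process, so the following is only a minimal sketch of the two generic ingredients it names: an agent action that is mapped to controller gains, and the TD3 critic target (target policy smoothing plus clipped double-Q). It assumes, purely for illustration, that the action is a normalized gain vector in [-1, 1]; the function names, gain bounds, and hyperparameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def scale_action_to_gains(action, gain_low, gain_high):
    """Map a normalized action in [-1, 1] to gains in [gain_low, gain_high].

    Assumption: the agent outputs normalized values that are rescaled to
    backstepping controller gains; the bounds are illustrative only.
    """
    action = np.clip(action, -1.0, 1.0)
    return gain_low + 0.5 * (action + 1.0) * (gain_high - gain_low)


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               rng, gamma=0.99, noise_std=0.2, noise_clip=0.5):
    """Compute the TD3 critic target for a batch of transitions.

    reward, done: arrays of shape (batch,)
    next_state:   array of shape (batch, state_dim)
    target_actor: callable, next_state -> actions of shape (batch, action_dim)
    target_q1/q2: callables, (next_state, action) -> values of shape (batch,)
    """
    next_action = target_actor(next_state)
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = rng.normal(0.0, noise_std, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -1.0, 1.0)
    # Clipped double-Q: take the smaller of the two target critic estimates.
    q_min = np.minimum(target_q1(next_state, next_action),
                       target_q2(next_state, next_action))
    # No bootstrapping past terminal transitions (done == 1).
    return reward + gamma * (1.0 - done) * q_min
```

In a full training loop the two critics would be regressed onto this target at every step, while the actor and the target networks would be updated less frequently, which is the "delayed" part of TD3. The gains returned by `scale_action_to_gains` would then be passed to the controller at each decision step.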