Autonomous reentry guidance based on planning-correction hierarchical reinforcement learning (Special Column on Autonomous Guidance and Control Technology for Space Transportation Systems)

doi:10.7527/S1000-6893.2025.32485

Gaoxiang Peng (彭高祥), Bo Wang (王博), Lei Liu (刘磊), Huijin Fan (樊慧津)

Huazhong University of Science and Technology

Received: 2025-06-27 Revised: 2025-11-09 Published: 2025-11-10 Online: 2025-11-10
Corresponding author: Bo Wang
Funding: Wuhan Knowledge Innovation Special Project for Basic Research; National Natural Science Foundation of China

Abstract

Abstract: To improve the rapid-response capability, mission adaptability, and robustness to significant model deviations of aerospace vehicles during reentry, this paper proposes an autonomous reentry guidance method based on planning-correction hierarchical reinforcement learning (HRL). To address the training non-stationarity of conventional HRL, a planning-correction hierarchical strategy is introduced that removes the dependence of upper-level policy training on lower-level state-transition data, yielding a two-layer guidance framework. In the planning layer, a modular reinforcement learning policy plans the reference angle-of-attack and bank-angle profiles, generating a global trajectory from the mission requirements and giving the framework its mission adaptability. In the correction layer, high-frequency trajectory correction under model parameter deviations counteracts the effect of large parameter deviations. Simulation results show that the two-layer guidance strategy tolerates larger parameter deviations and improves guidance accuracy when deviations are large. Compared with a predictor-corrector guidance algorithm, it also shows stronger mission adaptability and real-time performance, enabling autonomous guidance for missions with arbitrary positions and directions.

Key words: aerospace vehicle, reentry phase, autonomous guidance, hierarchical reinforcement learning, planning-correction hierarchy
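
The run-time structure described in the abstract (a planning layer that produces reference angle-of-attack and bank-angle profiles once per mission, and a correction layer that adjusts the commands at high frequency) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Mission` fields, the linear profile, the gain, the limits, and all numeric values are hypothetical placeholders standing in for the trained policies, and the sketch covers only how the two layers are called, not how they are trained.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A profile maps normalized flight progress s in [0, 1] to
# (angle of attack in degrees, bank angle in degrees).
Profile = Callable[[float], Tuple[float, float]]


@dataclass
class Mission:
    """Hypothetical mission description (fields are assumptions)."""
    target_range_km: float
    heading_offset_deg: float


def plan_profiles(mission: Mission) -> Profile:
    """Planning layer: placeholder for a trained planning policy.

    Called once per mission. Returns reference profiles; here a linear
    angle-of-attack ramp and a constant, saturated bank angle.
    """
    alpha_start, alpha_end = 40.0, 15.0  # illustrative values only
    bank_ref = max(-60.0, min(60.0, 2.0 * mission.heading_offset_deg))

    def profile(s: float) -> Tuple[float, float]:
        s = max(0.0, min(1.0, s))
        alpha = alpha_start + (alpha_end - alpha_start) * s
        return alpha, bank_ref

    return profile


def correct_command(ref: Tuple[float, float],
                    tracking_error: float) -> Tuple[float, float]:
    """Correction layer: placeholder for a trained correction policy.

    Adds a bounded bank-angle correction proportional to the tracking
    error; the gain and the 10-degree limit are assumptions.
    """
    alpha_ref, bank_ref = ref
    gain = 0.5
    d_bank = max(-10.0, min(10.0, gain * tracking_error))
    return alpha_ref, bank_ref + d_bank


def run_guidance(mission: Mission,
                 errors: List[float]) -> List[Tuple[float, float]]:
    """Plan once, then correct at every (high-rate) guidance step."""
    profile = plan_profiles(mission)           # low rate: once per mission
    n = len(errors)
    commands = []
    for k, err in enumerate(errors):           # high rate: every step
        s = k / max(n - 1, 1)                  # normalized progress
        commands.append(correct_command(profile(s), err))
    return commands


if __name__ == "__main__":
    for cmd in run_guidance(Mission(5000.0, 10.0), [0.0, 4.0, 100.0]):
        print(cmd)
```

In this sketch the planner never sees the corrector's outputs, which mirrors only loosely the abstract's point that the upper layer does not depend on lower-layer state-transition data.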

CLC number: V448.235

Gaoxiang Peng, Bo Wang, Lei Liu, Huijin Fan. Autonomous reentry guidance based on planning-correction hierarchical reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, doi: 10.7527/S1000-6893.2025.32485.


