To improve the rapid-response capability, mission adaptability, and robustness to large model deviations of aerospace vehicles during reentry, this paper proposes an autonomous reentry guidance method based on planning-correction hierarchical reinforcement learning (HRL). To address the training non-stationarity of conventional HRL, a planning-correction hierarchical strategy is introduced that removes the dependence of upper-level policy training on lower-level state-transition data, yielding a dual-layer guidance framework. In the planning layer, a modular reinforcement learning policy plans the reference angle-of-attack and bank-angle profiles and generates a global trajectory according to mission requirements, which gives the framework its mission adaptability. In the correction layer, high-frequency trajectory correction under model parameter deviations counteracts the effect of large parameter deviations. Simulation results show that the dual-layer guidance strategy tolerates larger parameter deviations and improves guidance accuracy under large deviations. Compared with a predictor-corrector guidance algorithm, it also shows stronger mission adaptability and real-time performance, enabling autonomous guidance for missions with arbitrary positions and directions.
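
The control flow of such a dual-layer scheme can be illustrated with a minimal Python sketch. It is not the method of this paper: `plan` and `correct` are hypothetical stand-ins for the trained planning-layer and correction-layer policies, the linear profiles, the normalized progress variable `s`, the proportional gain, and the 5-degree limit are all assumptions, and applying the correction to the bank angle only is likewise an assumption made for brevity. The sketch only shows that the reference profiles are planned once from the mission and are then corrected at every high-rate step, with the planner never consuming data produced by the corrector.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Reference:
    """Reference command profiles from the planning layer, in degrees."""
    alpha: Callable[[float], float]  # angle of attack vs. progress s in [0, 1]
    bank: Callable[[float], float]   # bank angle vs. progress s in [0, 1]


def plan(mission: dict) -> Reference:
    """Hypothetical stand-in for the planning-layer policy (linear profiles)."""
    a0, a1 = mission["alpha_start"], mission["alpha_end"]
    b0 = mission["bank_magnitude"]
    return Reference(
        alpha=lambda s: a0 + (a1 - a0) * s,
        bank=lambda s: b0 * (1.0 - s),
    )


def correct(tracking_error: float, gain: float = 0.5, limit: float = 5.0) -> float:
    """Hypothetical stand-in for the correction-layer policy.

    Returns a proportional correction clipped to [-limit, limit] degrees.
    """
    return max(-limit, min(limit, gain * tracking_error))


def guide(mission: dict, errors: List[float]) -> List[Tuple[float, float]]:
    """Plan once, then apply a correction at every high-rate step."""
    ref = plan(mission)  # planning layer: one global reference per mission
    n = len(errors)
    commands = []
    for k, err in enumerate(errors):  # correction layer: high-rate loop
        s = k / max(n - 1, 1)         # normalized progress (assumed scheduling variable)
        delta = correct(err)
        # Assumption: only the bank-angle command is corrected.
        commands.append((ref.alpha(s), ref.bank(s) + delta))
    return commands


mission = {"alpha_start": 40.0, "alpha_end": 20.0, "bank_magnitude": 60.0}
commands = guide(mission, errors=[0.0, 4.0, -20.0])
```

With the illustrative inputs above, `commands` is `[(40.0, 60.0), (30.0, 32.0), (20.0, -5.0)]`: the last tracking error would call for a correction of -10 degrees, which is clipped to the assumed limit of -5 degrees.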