
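Stochastic optimal control based powered descent guidance considering landing chance constraints (Special Column on Autonomous Guidance and Control Technology for Space Transportation Systems)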

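Lin-Kun HE1, Ran ZHANG2, Hui-Feng LI2, Wei-Min BAO3

  1. School of Astronautics, Beihang University
  2. Beihang University
  3. China Aerospace Science and Technology Corporation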
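  • Received: 2026-01-07 Revised: 2026-06-24 Published: 2026-06-26
  • Corresponding author: Ran Zhang
  • Funding:
    National Natural Science Foundation of China; National Natural Science Foundation of China; Beijing Natural Science Foundation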

Stochastic optimal control based powered descent guidance with landing chance constraints

Lin-Kun HE1, Ran ZHANG2, Hui-Feng LI2, Wei-Min BAO3

  • Received: 2026-01-07 Revised: 2026-06-24 Published: 2026-06-26
  • Contact: Ran Zhang

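Abstract: Stochastic optimal control based powered descent guidance recasts the powered descent guidance problem under uncertainty as a stochastic optimal control problem, and is an important technical route to improving the reliability of reusable rockets. However, given the wide range of initial states and uncertainty distributions encountered in atmospheric powered descent, existing stochastic optimal control based powered descent guidance methods cannot handle long-tailed terminal state distributions and struggle to regulate low-probability but fatal extreme terminal state errors. This paper therefore studies the stochastic optimal control based powered descent guidance problem with landing chance constraints, adding threshold constraints on the tail quantiles of the terminal state distribution directly to the optimal control problem. To address the difficulties that the newly added landing chance constraints bring to solving the problem, a guidance policy based on neural network parameter tuning is designed: 1) an improved physics-informed neural network structure is adopted, which adapts to the initial state and uncertainty distribution parameters by introducing control saturation mitigation and network input augmentation, resolving the high sensitivity of the landing chance constraints to those parameters; 2) a polynomial chaos surrogate model of the terminal state distribution is built, which estimates the landing chance constraint quantiles with high accuracy from a small number of sampled trajectories, resolving the low evaluation efficiency of the landing chance constraints; 3) a reinforcement learning training framework is built on samples drawn from the surrogate model, enabling gradient-free optimization of the guidance policy and resolving the non-differentiability of the landing chance constraints with respect to the guidance policy. Numerical simulation results show that, compared with existing stochastic optimal control based powered descent guidance methods, the proposed method effectively improves the long-tail behavior of the terminal state distribution and significantly reduces the probability of rare landing failures.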

Key words: powered descent guidance, chance constraint, physics-informed neural network, polynomial chaos expansion, reinforcement learning
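
As a reading aid, a threshold constraint on a tail quantile of the terminal state distribution can be written in the following generic form. This is only a sketch under assumptions: the policy parameters \theta, the mass m, the terminal position \mathbf{r}(t_f), the target \mathbf{r}_{\mathrm{tgt}}, the tolerance \delta_r, the risk level \varepsilon, and the choice of expected propellant consumption as the objective are illustrative symbols that do not appear in the abstract, and the paper's actual formulation may differ.

```latex
\begin{aligned}
\min_{\theta}\quad & \mathbb{E}\big[\, m(t_0) - m(t_f) \,\big] \\
\text{s.t.}\quad & Q_{1-\varepsilon}\big( \lVert \mathbf{r}(t_f) - \mathbf{r}_{\mathrm{tgt}} \rVert \big) \le \delta_r ,
\qquad Q_p(X) = \inf\{\, x : \Pr(X \le x) \ge p \,\} .
\end{aligned}
% With this definition of the quantile, the constraint is equivalent to the
% usual chance-constraint form:
%   \Pr\big( \lVert \mathbf{r}(t_f) - \mathbf{r}_{\mathrm{tgt}} \rVert \le \delta_r \big) \ge 1 - \varepsilon .
```

Unlike a bound on the mean and covariance, a constraint of this kind acts directly on the tail of the distribution, which is where the rare but fatal landing errors lie.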

Abstract: Stochastic optimal control based powered descent guidance (SOC-PDG) recasts the powered descent guidance problem under uncertainty as a stochastic optimal control problem and is a key technology for improving the reliability of reusable rockets. However, across a wide range of initial state and uncertainty distribution combinations, existing SOC-PDG methods that describe constraints through means and covariances cannot handle potentially long-tailed terminal state distributions, which makes it difficult to control low-probability but catastrophic large terminal errors. To address this, this paper investigates the SOC-PDG problem with landing chance constraints (SOC-PDG-LCC), which directly imposes constraints on quantiles of the terminal state distribution. To cope with the high sensitivity, low evaluation efficiency, and non-differentiability of the landing chance constraints, a guidance policy with the following key features is designed: 1) An improved neural network based parametric guidance architecture maintains consistent landing performance across different initial state and uncertainty distribution parameters. 2) A guidance policy evaluation method based on a polynomial chaos expansion surrogate model efficiently estimates the mean propellant consumption and the quantiles used in the landing chance constraints during training. 3) A reinforcement learning training method uses samples from the surrogate model to optimize the guidance policy without gradients. Simulation results show that, compared with existing mean-covariance SOC-PDG methods, the proposed method effectively mitigates the long-tail behavior of the terminal state distribution across a wide range of initial states and uncertainty parameters and significantly reduces the probability of rare landing failures.

Key words: powered descent guidance, chance constraint, physics-informed neural network, polynomial chaos expansion, reinforcement learning
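
The idea of estimating a tail quantile from a polynomial chaos expansion surrogate fitted to a few sampled trajectories can be illustrated with a minimal, self-contained Python sketch. Everything in it is an assumption made for illustration and not the paper's implementation: a single standard normal uncertainty, a degree-3 probabilists' Hermite basis, a least-squares fit, and the hypothetical function `toy_terminal_miss` standing in for one closed-loop trajectory simulation.

```python
import random


def toy_terminal_miss(xi):
    """Hypothetical stand-in for one expensive trajectory simulation."""
    return 0.2 + 0.5 * xi**2 + 0.1 * xi**3


def hermite_basis(xi, degree):
    """Probabilists' Hermite polynomials He_0..He_degree at xi (recurrence)."""
    basis = [1.0, xi]
    for n in range(1, degree):
        basis.append(xi * basis[n] - n * basis[n - 1])
    return basis[: degree + 1]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_pce(xis, ys, degree):
    """Least-squares PCE coefficients via the normal equations."""
    rows = [hermite_basis(xi, degree) for xi in xis]
    k = degree + 1
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(gram, rhs)


def pce_eval(coeffs, xi):
    """Evaluate the surrogate at xi."""
    basis = hermite_basis(xi, len(coeffs) - 1)
    return sum(c * h for c, h in zip(coeffs, basis))


def empirical_quantile(values, level):
    """Simple order-statistic estimate of the quantile at the given level."""
    ordered = sorted(values)
    return ordered[min(int(level * len(ordered)), len(ordered) - 1)]


rng = random.Random(0)

# A small number of "expensive" simulations is used to fit the surrogate.
train_xi = [rng.gauss(0.0, 1.0) for _ in range(40)]
train_y = [toy_terminal_miss(xi) for xi in train_xi]
coeffs = fit_pce(train_xi, train_y, degree=3)

# The surrogate is cheap, so the tail quantile can use many samples.
surrogate_samples = [pce_eval(coeffs, rng.gauss(0.0, 1.0)) for _ in range(50_000)]
q99 = empirical_quantile(surrogate_samples, 0.99)

print("PCE coefficients:", [round(c, 6) for c in coeffs])
print("surrogate mean (zeroth coefficient):", round(coeffs[0], 6))
print("estimated 99% quantile of the toy miss:", round(q99, 3))
```

Because the toy model is itself a cubic polynomial, the fit recovers the coefficients 0.7, 0.3, 0.5, 0.1 up to rounding, and the zeroth coefficient is the surrogate mean. A real guidance problem would involve several uncertain parameters and a truncated expansion, so the quantile estimate would carry surrogate error as well as sampling error.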