SAC-based robust uplink transmission optimization method for LEO satellites

LIU Chang; MA Biao; YANG Liu; XU Ba; OUYANG Jian

doi:10.7527/S1000-6893.2026.33400

Acta Aeronautica et Astronautica Sinica


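1. Nanjing University of Posts and Telecommunications
2. Department of Electronic and Optical Engineering, Space Engineering University
3. National University of Defense Technology

Received date: 2026-01-20

Revised date: 2026-05-03

Online published: 2026-05-08

Funding

Fund of the Key Laboratory of Intelligent Space TT&C and Operation, Ministry of Education; Open Fund of the Key Laboratory of Intelligent Support Technology for Complex Environments (Nanjing University of Information Science and Technology), Ministry of Education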



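Cite this article

LIU Chang, MA Biao, YANG Liu, XU Ba, OUYANG Jian. SAC-based robust uplink transmission optimization method for LEO satellites[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2026.33400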

Abstract

In low Earth orbit (LEO) satellite uplink communications, the highly dynamic environment makes prior statistical information on channel errors difficult to obtain. To address this problem, this paper proposes a robust uplink transmission optimization method. With the objective of minimizing the total transmit power, subject to user quality of service (QoS) and maximum transmit power constraints, a robust uplink transmission optimization problem is formulated for scenarios in which user angle information is inaccurate. This non-convex problem is modeled as a Markov decision process (MDP), and a multi-user robust beamforming (BF) and power control algorithm based on soft actor-critic (SAC) is designed. The satellite can thus learn and adjust its beamforming weight vectors and power control strategy autonomously through interaction with the environment, without relying on prior information about the error statistics. Simulation results show that, compared with a non-robust scheme and a proximal policy optimization (PPO) baseline, the proposed method reduces the required transmit power by an average of 59% and 24%, respectively, across different error scenarios, which demonstrates its robustness and superiority.

Keywords: low Earth orbit satellite; uplink transmission; imperfect channel state information; robust beamforming; power control; deep reinforcement learning
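
The setting described in the abstract can be illustrated with a small numerical sketch. This is not the authors' algorithm or system model: it assumes a uniform linear array with half-wavelength spacing, line-of-sight channels determined only by the user angle, and a hypothetical penalty-based reward. The names `steering_vector`, `uplink_sinr`, and `reward`, and all parameter values, are illustrative assumptions. The sketch shows how receive weights steered toward an erroneous angle reduce the uplink SINR, and how total transmit power and QoS shortfall might be combined into a single MDP reward for an agent such as SAC.

```python
import numpy as np

def steering_vector(theta_rad, n_antennas, spacing=0.5):
    """Unit-norm ULA steering vector (assumed geometry; spacing in wavelengths)."""
    n = np.arange(n_antennas)
    phase = 2 * np.pi * spacing * n * np.sin(theta_rad)
    return np.exp(1j * phase) / np.sqrt(n_antennas)

def uplink_sinr(powers, true_angles, weights, noise_power=1e-2):
    """Per-user SINR at the satellite; column k of `weights` is user k's receive beamformer."""
    n_antennas = weights.shape[0]
    channels = np.stack([steering_vector(t, n_antennas) for t in true_angles], axis=1)
    gains = np.abs(weights.conj().T @ channels) ** 2   # gains[k, j] = |w_k^H a_j|^2
    received = gains * powers[None, :]                 # power of user j seen in beam k
    signal = np.diag(received)
    interference = received.sum(axis=1) - signal
    noise = noise_power * np.sum(np.abs(weights) ** 2, axis=0)
    return signal / (interference + noise)

def reward(powers, sinr, sinr_min, penalty=10.0):
    """Hypothetical reward: negative total power minus a penalty on QoS shortfall."""
    shortfall = np.maximum(sinr_min - sinr, 0.0).sum()
    return -powers.sum() - penalty * shortfall

if __name__ == "__main__":
    true_angles = np.array([0.0, 0.4])       # true user angles in radians (assumed)
    angle_error = np.array([0.05, -0.05])    # angle estimation error (assumed)
    powers = np.array([1.0, 1.0])
    n_antennas = 8

    # Non-robust matched-filter weights steered to the true and to the erroneous angles.
    w_true = np.stack([steering_vector(t, n_antennas) for t in true_angles], axis=1)
    w_err = np.stack([steering_vector(t, n_antennas) for t in true_angles + angle_error], axis=1)

    for label, w in (("perfect angles", w_true), ("erroneous angles", w_err)):
        sinr = uplink_sinr(powers, true_angles, w)
        print(label, "SINR:", sinr, "reward:", reward(powers, sinr, sinr_min=10.0))
```

In a learning setup of the kind the abstract describes, the agent's action would replace the fixed matched-filter weights and powers used here, and the reward would be evaluated on the true channel without the agent knowing the error statistics.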
