Acta Aeronautica et Astronautica Sinica, 2025, Vol. 46, Issue 18: 331837. doi: 10.7527/S1000-6893.2025.31837

考虑信道资源约束的多无人机航迹与通信策略协同规划

王辰1, 魏才盛1, 殷泽阳1, 靳锴2, 李星辰3

  1. 中南大学 自动化学院,长沙 410083
  2. 中国电子科技集团第五十四研究所,石家庄 050081
  3. 军事科学院 国防科技创新研究院,北京 100071
  • 收稿日期:2025-01-22 修回日期:2025-03-26 接受日期:2025-04-17 出版日期:2025-09-25 发布日期:2025-04-25
  • 通讯作者: 魏才盛 E-mail:caisheng_wei@csu.edu.cn
  • 基金资助:
    国家自然科学基金(62373379);湖南省自然科学基金(2024JJ6482);中南大学创新驱动项目(2023CXQD066)

Collaborative planning of multi-UAV trajectories and communication strategies considering channel resource constraints

Chen WANG1, Caisheng WEI1, Zeyang YIN1, Kai JIN2, Xingchen LI3

  1. School of Automation, Central South University, Changsha 410083, China
  2. The 54th Research Institute of CETC, Shijiazhuang 050081, China
  3. National Innovation Institute of Defense Technology, Academy of Military Science, Beijing 100071, China
  • Received: 2025-01-22 Revised: 2025-03-26 Accepted: 2025-04-17 Published: 2025-09-25 Online: 2025-04-25
  • Contact: Caisheng WEI, E-mail: caisheng_wei@csu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62373379); Hunan Provincial Natural Science Foundation (2024JJ6482); Central South University Innovation-Driven Research Program (2023CXQD066)

摘要:

针对多无人机协同侦察任务中飞行航迹与通信策略的优化问题,考虑飞行距离、通信能耗、信道容量等多元代价和基站信道资源约束、无人机性能约束、避碰约束等多重约束,提出了一种基于深度强化学习的协同规划方法。首先,面向随机未知侦察环境建立了多无人机航迹与通信策略协同规划模型。其次,提出了一种基于多智能体近端策略优化算法的端到端深度强化学习框架,以飞行距离、通信能耗、信道容量为多元优化目标,对无人机轨迹、通信连接策略、通信发射功率等耦合变量进行联合优化求解。为了降低多目标任务的学习和求解难度,基于人工势场法设计了一种包含基站虚拟引力的航迹规划子模型,通过强化学习自动参数寻优的方式,降低决策空间大小、加快模型收敛速度。最后,通过仿真实验验证了所提方法在优化任务总成本指标上的优势。

关键词: 无人机, 航迹规划, 基站信道资源约束, 深度强化学习, 协同优化

Abstract:

This study addresses the joint optimization of flight trajectories and communication strategies in multi-UAV collaborative reconnaissance missions and proposes a collaborative planning approach based on deep reinforcement learning. The approach accounts for multiple costs (flight distance, communication energy consumption, and channel capacity) and multiple constraints (base station channel resource constraints, UAV performance constraints, and collision avoidance constraints). First, a collaborative planning model for multi-UAV trajectories and communication strategies is established for random, unknown reconnaissance environments. Second, an end-to-end deep reinforcement learning framework based on the multi-agent proximal policy optimization algorithm is proposed, which jointly optimizes coupled variables such as UAV trajectories, communication connection strategies, and communication transmit power, with flight distance, communication energy consumption, and channel capacity as the optimization objectives. To reduce the difficulty of learning and solving the multi-objective task, a trajectory planning sub-model that incorporates a virtual attractive force from base stations is designed based on the artificial potential field method; its parameters are tuned automatically by reinforcement learning, which shrinks the decision space and speeds up model convergence. Finally, simulation experiments demonstrate the advantages of the proposed method in optimizing the total mission cost.
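
As a rough illustration of the artificial potential field idea mentioned in the abstract, the sketch below moves one UAV a fixed step along the sum of three forces: attraction to its target, a virtual attraction to the nearest base station, and repulsion from nearby obstacles. The abstract does not give the force expressions, so the function name `apf_step`, the linear attraction terms, the classical repulsion term, and all gains (`k_att`, `k_bs`, `k_rep`, `rep_radius`, `step`) are assumptions made for illustration only. In a setup like the one described, a reinforcement learning policy might output such gains instead of raw waypoints, which is one way the decision space could become smaller.

```python
import math


def apf_step(uav, target, base_stations, obstacles,
             k_att=1.0, k_bs=0.3, k_rep=1.0, rep_radius=2.0, step=0.5):
    """Return the next 2-D position of one UAV under a simple potential field.

    All force forms and gains here are illustrative assumptions,
    not the formulation used in the paper.
    """
    # Attraction toward the reconnaissance target (linear in distance).
    fx = k_att * (target[0] - uav[0])
    fy = k_att * (target[1] - uav[1])

    # Virtual attraction toward the nearest base station, which biases
    # the path toward regions with a better communication link.
    if base_stations:
        bs = min(base_stations, key=lambda b: math.dist(uav, b))
        fx += k_bs * (bs[0] - uav[0])
        fy += k_bs * (bs[1] - uav[1])

    # Repulsion from obstacles (or other UAVs) closer than rep_radius.
    for ob in obstacles:
        d = math.dist(uav, ob)
        if 0.0 < d < rep_radius:
            mag = k_rep * (1.0 / d - 1.0 / rep_radius) / d ** 2
            fx += mag * (uav[0] - ob[0]) / d
            fy += mag * (uav[1] - ob[1]) / d

    # Move a fixed step length along the resultant force direction.
    norm = math.hypot(fx, fy)
    if norm == 0.0:
        return uav
    return (uav[0] + step * fx / norm, uav[1] + step * fy / norm)
```

With `k_bs = 0` the UAV heads straight for the target; raising `k_bs` bends the path toward the base station, trading flight distance for link quality.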

Key words: UAV, trajectory planning, base station channel resource constraint, deep reinforcement learning, collaborative optimization

CLC number: