Collaborative planning of multi-UAV trajectories and communication strategies considering channel resource constraints

  • WANG Chen
  • WEI Cai-Sheng
  • YIN Ze-Yang
  • JIN Kai
  • LI Xing-Chen

  • 1. Central South University
    2. School of Automation, Central South University
    3. The 54th Research Institute of China Electronics Technology Group Corporation
    4. National Innovation Institute of Defense Technology, Academy of Military Sciences

Received date: 2025-01-22

  Revised date: 2025-04-18

  Online published: 2025-04-25

Funding

National Natural Science Foundation of China; Natural Science Foundation of Hunan Province; Innovation-Driven Project of Central South University

Cite this article

WANG Chen, WEI Cai-Sheng, YIN Ze-Yang, JIN Kai, LI Xing-Chen. Collaborative planning of multi-UAV trajectories and communication strategies considering channel resource constraints[J]. Acta Aeronautica et Astronautica Sinica, 0 : 1 -0 . DOI: 10.7527/S1000-6893.2025.31837

Abstract

To optimize flight trajectories and communication strategies in multi-UAV collaborative reconnaissance missions, this study proposes a collaborative planning method based on deep reinforcement learning. The method accounts for multiple costs (flight distance, communication energy consumption, and channel capacity) and multiple constraints (base station channel resources, UAV performance, and collision avoidance). First, a collaborative planning model for multi-UAV trajectories and communication strategies is established for random, unknown reconnaissance environments. Second, an end-to-end deep reinforcement learning framework based on the multi-agent proximal policy optimization (MAPPO) algorithm is proposed; it takes flight distance, communication energy consumption, and channel capacity as its optimization objectives and jointly optimizes coupled variables such as UAV trajectories, communication connection strategies, and communication transmit power. Within this framework, to reduce the difficulty of learning and solving the multi-objective task, a trajectory planning sub-model incorporating a virtual attractive force from the base stations is designed, which reduces the size of the decision space. Finally, simulation experiments verify the advantage of the proposed method in terms of the total mission cost.
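The abstract does not state the force law behind the base station virtual attraction, so the following Python fragment is only an illustrative sketch under assumed conventions. It blends the direction toward a reconnaissance target with a pull toward the nearest base station, in the style of an artificial potential field. The function names, the 2D geometry, and the `gain` weight are all hypothetical and are not taken from the paper.

```python
import math


def unit(vx, vy):
    """Return the unit vector of (vx, vy), or (0.0, 0.0) for a zero vector."""
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm) if norm > 0 else (0.0, 0.0)


def attracted_heading(uav, target, base_stations, gain=0.5):
    """Blend the target direction with a virtual pull toward the nearest base station.

    `gain` is a hypothetical weight in [0, 1]: 0 ignores the base stations,
    1 flies straight at the nearest one. The actual sub-model in the paper
    may use a different force law.
    """
    tx, ty = unit(target[0] - uav[0], target[1] - uav[1])
    nearest = min(base_stations, key=lambda b: math.dist(uav, b))
    bx, by = unit(nearest[0] - uav[0], nearest[1] - uav[1])
    return unit((1 - gain) * tx + gain * bx, (1 - gain) * ty + gain * by)


def step(uav, heading, speed=1.0):
    """Advance the UAV position by one time step along the given heading."""
    return (uav[0] + speed * heading[0], uav[1] + speed * heading[1])


if __name__ == "__main__":
    uav = (0.0, 0.0)
    heading = attracted_heading(uav, (10.0, 0.0), [(0.0, 5.0), (100.0, 100.0)])
    print(step(uav, heading))
```

One possible reading of the reduced decision space is that a rule of this kind fixes part of the motion, so the learned policy has fewer quantities to choose at each step; the abstract does not confirm this division of labor.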
