Single decision-making models based on reinforcement learning often lack adaptability in complex UAV formation-control tasks because their autonomous decision-making capacity is limited. To address this, this paper proposes a distributed decision-making method in which the virtual structure approach guides a deep reinforcement learning algorithm. First, to ease policy optimization in diverse task environments, the overall task is functionally decomposed: local task planning is performed for individual scenarios such as static obstacles, random obstacles, and communication interference; multiple decision sub-models are constructed; and an autonomous inter-model invocation procedure is designed. Next, to strengthen guidance, the virtual structure method is combined with the soft actor-critic (SAC) reinforcement learning algorithm to build a distributed decision-making framework, and decentralized training of each sub-model substantially improves the success rate and flexibility of task execution. Finally, a centralized execution scheme uses environmental changes as triggers for the dynamic selection and seamless switching of sub-models, enabling the UAV formation to adjust its geometry autonomously as the task environment changes. This achieves the mission objectives while markedly improving the swarm's overall environmental adaptability and survivability. Simulation experiments across multiple scenarios validate the effectiveness of the method.
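The two mechanisms named above — virtual-structure formation references and environment-triggered sub-model switching — can be illustrated with a minimal sketch. This is not the paper's implementation: the names (`virtual_structure_targets`, `SubModelDispatcher`, the scenario keys) and the assumption that each trained SAC sub-policy is a callable object are all hypothetical, introduced only to make the dispatch idea concrete.

```python
import numpy as np

def virtual_structure_targets(center, heading, offsets):
    """Map body-frame formation offsets of the virtual structure into
    world-frame target positions for each UAV (2-D sketch).

    center  : (2,) world position of the virtual structure's reference point
    heading : rotation of the structure, in radians
    offsets : (n, 2) per-UAV slot positions in the structure's body frame
    """
    c, s = np.cos(heading), np.sin(heading)
    R = np.array([[c, -s],
                  [s,  c]])           # standard 2-D rotation matrix
    return center + offsets @ R.T     # rotate each offset, then translate

class SubModelDispatcher:
    """Hold one trained sub-policy per scenario and switch between them
    when the perceived environment changes (the trigger condition)."""

    def __init__(self, policies, initial="nominal"):
        # policies: dict mapping scenario name -> trained sub-policy,
        # e.g. {"nominal": pi0, "static_obstacle": pi1,
        #       "random_obstacle": pi2, "comm_interference": pi3}
        self.policies = policies
        self.active = initial

    def select(self, scenario):
        # An environment change triggers a switch to the matching
        # sub-model; unknown scenarios keep the current policy active.
        if scenario in self.policies and scenario != self.active:
            self.active = scenario
        return self.policies[self.active]
```

In this sketch the dispatcher realizes the "centralized execution" step: the formation keeps following virtual-structure targets while only the active sub-policy, chosen from the detected scenario, is swapped in.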