Acta Aeronautica et Astronautica Sinica > 2026, Vol. 47 Issue (3): 631880-631880   doi: 10.7527/S1000-6893.2025.31880

Column: Target State Collaboration and Intelligent Perception

  • Supported by:
    National Natural Science Foundation of China (62271286, 62371271, 42406173); Open Fund of the Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering (2024SDSJ02)

Multi-objective evolution with deep deterministic policy gradient algorithm for mobile edge networks

Lei ZHANG1,2, Can TIAN2, Fangqing WEN2, Qinghe ZHANG2, Han LIU2

  1. Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443000, China
    2. College of Computer and Information Technology, China Three Gorges University, Yichang 443000, China
  • Received:2025-02-20 Revised:2025-04-14 Accepted:2025-05-06 Online:2025-05-14 Published:2025-05-13
  • Contact: Fangqing WEN E-mail:wenfangqing@ctgu.edu.cn


Abstract:

Mobile Edge Computing (MEC) networks assisted by Unmanned Aerial Vehicles (UAVs) demonstrate great potential in emergency response, real-time monitoring, and other fields. However, the efficient operation of an MEC network faces challenges from multiple competing optimization objectives, such as high energy consumption and high latency. Therefore, a Multi-Objective Evolution with Deep Deterministic Policy Gradient (MOE-DDPG) algorithm for UAV-assisted MEC network optimization is introduced. First, an integrated multi-objective optimization model is established to ensure comprehensive performance of the MEC network by minimizing latency and energy consumption while maximizing the number of completed UAV tasks. Second, a bidirectional selection strategy for matching weight vectors with individuals is proposed to address the difficulty the traditional Deep Deterministic Policy Gradient (DDPG) algorithm has in balancing multiple objectives, thereby significantly enhancing population diversity. Finally, by organically fusing the Multi-Objective Evolution (MOE) algorithm with the DDPG algorithm, a novel MOE-DDPG framework is proposed that optimizes the overall performance of the MEC network in real time. Experimental results show that the MOE-DDPG algorithm not only significantly improves the distribution and convergence of the Pareto solution set but also effectively reduces energy consumption and latency while increasing the number of completed tasks.

Key words: deep reinforcement learning, Mobile Edge Computing (MEC), unmanned aerial vehicle, Multi-Objective Evolution (MOE), bidirectional selection
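For readers unfamiliar with matching weight vectors to individuals in a multi-objective population, the idea behind such a bidirectional selection can be sketched as follows. This is a minimal illustration only: it assumes weighted-sum scalarization of minimization objectives and a greedy mutual-preference pairing; the function name and matching details are assumptions for exposition, not the paper's actual procedure.

```python
import numpy as np

def bidirectional_match(objectives, weights):
    """Greedily pair each weight vector with a distinct individual.

    objectives: (n_ind, n_obj) array of minimization objectives.
    weights:    (n_w, n_obj) array of weight vectors, n_w <= n_ind.
    Returns a dict {weight_index: individual_index}.
    """
    # Scalarized cost of every individual under every weight vector.
    cost = weights @ objectives.T                      # shape (n_w, n_ind)
    # Enumerate all (weight, individual) pairs from lowest scalarized
    # cost upward; accept a pair only if both sides are still free,
    # so the preference runs in both directions.
    flat_order = np.argsort(cost, axis=None)
    order = np.dstack(np.unravel_index(flat_order, cost.shape))[0]
    matched_w, matched_i, pairs = set(), set(), {}
    for w, i in order:
        if w not in matched_w and i not in matched_i:
            pairs[int(w)] = int(i)
            matched_w.add(w)
            matched_i.add(i)
    return pairs
```

Because each individual can serve at most one weight vector, extreme weight vectors keep their specialist individuals instead of all vectors collapsing onto a single all-round solution, which is one way such a scheme preserves population diversity.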
