Acta Aeronautica et Astronautica Sinica > 2026, Vol. 47 Issue (3): 631880-631880   doi: 10.7527/S1000-6893.2025.31880

Column: Target State Collaboration and Intelligent Perception

  • Supported by:
    National Natural Science Foundation of China (62271286, 62371271, 42406173); Open Fund of the Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering (2024SDSJ02)

Multi-objective evolution with deep deterministic policy gradient algorithm for mobile edge networks

Lei ZHANG1,2, Can TIAN2, Fangqing WEN2, Qinghe ZHANG2, Han LIU2

  1. Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University, Yichang 443000, China
    2. College of Computer and Information Technology, China Three Gorges University, Yichang 443000, China
  • Received:2025-02-20 Revised:2025-04-14 Accepted:2025-05-06 Online:2025-05-14 Published:2025-05-13
  • Contact: Fangqing WEN E-mail:wenfangqing@ctgu.edu.cn


Abstract:

Mobile Edge Computing (MEC) networks assisted by Unmanned Aerial Vehicles (UAVs) demonstrate great potential in emergency response, real-time monitoring, and other fields. However, the efficient operation of an MEC network faces challenges from multiple competing optimization objectives, such as high energy consumption and high latency. Therefore, a Multi-Objective Evolution with Deep Deterministic Policy Gradient (MOE-DDPG) algorithm for UAV-assisted MEC network optimization is introduced. First, an integrated multi-objective optimization model is established to ensure comprehensive performance of the MEC network by minimizing latency and energy consumption while maximizing the number of completed UAV tasks. Second, a bidirectional selection strategy for matching weight vectors with individuals is proposed to address the difficulty the traditional Deep Deterministic Policy Gradient (DDPG) algorithm has in balancing multiple objectives, thereby significantly enhancing population diversity. Finally, by organically fusing the Multi-Objective Evolution (MOE) algorithm with the DDPG algorithm, a novel MOE-DDPG framework is proposed that optimizes the overall performance of the MEC network in real time. Experimental results show that the MOE-DDPG algorithm not only significantly improves the distribution and convergence of the Pareto solution set but also effectively reduces energy consumption and latency while increasing the number of completed tasks.

Key words: deep reinforcement learning, Mobile Edge Computing (MEC), unmanned aerial vehicle, Multi-Objective Evolution (MOE), bidirectional selection
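For readers unfamiliar with matching weight vectors to individuals in a multi-objective population, the idea behind such a bidirectional selection can be sketched as follows. This is a minimal illustration only: it assumes weighted-sum scalarization of minimization objectives and a greedy mutual-preference pairing; the function name and matching details are assumptions for exposition, not the paper's actual procedure.

```python
import numpy as np

def bidirectional_match(objectives, weights):
    """Greedily pair each weight vector with a distinct individual.

    objectives: (n_ind, n_obj) array of minimization objectives.
    weights:    (n_w, n_obj) array of weight vectors, n_w <= n_ind.
    Returns a dict {weight_index: individual_index}.
    """
    # Scalarized cost of every individual under every weight vector.
    cost = weights @ objectives.T                      # shape (n_w, n_ind)
    # Enumerate all (weight, individual) pairs from lowest scalarized
    # cost upward; accept a pair only if both sides are still free,
    # so the preference runs in both directions.
    flat_order = np.argsort(cost, axis=None)
    order = np.dstack(np.unravel_index(flat_order, cost.shape))[0]
    matched_w, matched_i, pairs = set(), set(), {}
    for w, i in order:
        if w not in matched_w and i not in matched_i:
            pairs[int(w)] = int(i)
            matched_w.add(w)
            matched_i.add(i)
    return pairs
```

Because each individual can serve at most one weight vector, extreme weight vectors keep their specialist individuals instead of all vectors collapsing onto a single all-round solution, which is one way such a scheme preserves population diversity.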
