UAV complete data collection trajectory planning algorithm based on time window constraints

doi:10.7527/S1000-6893.2025.32451

Electronics, Electrical Engineering and Control


Sihua GAO, Bingyang ZHAO, Jianfu LI

College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China

Received: 2025-06-20 Revised: 2025-07-04 Accepted: 2025-08-25 Online: 2025-09-09 Published: 2025-09-05
Contact: Jianfu LI E-mail: jfli@cauc.edu.cn
Supported by: National Natural Science Foundation of China (62173332)

Abstract:

Unmanned aerial vehicles (UAVs) have been widely adopted to assist wireless sensor networks (WSNs) in data collection tasks. However, time window constraints at the sensor nodes pose new challenges: the UAV must not only arrive in the vicinity of each node awaiting data upload within that node's time window, but also complete data collection before the window closes. Poor trajectory planning increases the UAV's flight distance, so complete data collection cannot be guaranteed. Although increasing flight speed shortens flight time, it also drains the UAV's energy faster, which can easily cause the data collection task to fail. To address these problems, for WSN data collection scenarios in which laser charging stations are deployed, we formulate and mathematically model the UAV complete data collection trajectory planning problem under time window constraints. We then design a reinforcement learning framework based on a hierarchical hybrid action representation model (H-HyAR) that jointly optimizes the UAV's visiting order of target nodes, hovering offset, and flight speed, and exploits the hierarchical dependencies among the three, so as to minimize the UAV's flight distance during the data collection task. Simulation results show that H-HyAR outperforms three other hybrid-action reinforcement learning algorithms and the proximal policy optimization (PPO) algorithm, both in flight distance and in comparative experiments on the factors that affect it, and that it has good robustness and generalization ability.

Key words: unmanned aerial vehicle trajectory planning, hierarchical hybrid action representation, deep reinforcement learning, time window, complete data collection, wireless sensor networks
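The two time-window conditions described in the abstract (arrive inside the node's window, and finish collecting before it closes) can be illustrated with a minimal Python sketch. The names `TimeWindow` and `collection_is_complete`, and the idea of passing the upload duration as a single number, are hypothetical choices made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeWindow:
    """Hypothetical container for one node's time window, in seconds."""
    open_s: float
    close_s: float


def collection_is_complete(arrival_s: float, upload_s: float, window: TimeWindow) -> bool:
    """Return True if both time-window conditions from the abstract hold.

    arrival_s: time at which the UAV reaches the node's vicinity.
    upload_s:  assumed time needed to collect all of the node's data.
    """
    # Condition 1: the UAV arrives while the window is open.
    arrives_in_window = window.open_s <= arrival_s <= window.close_s
    # Condition 2: data collection ends before the window closes.
    finishes_in_time = arrival_s + upload_s <= window.close_s
    return arrives_in_window and finishes_in_time


if __name__ == "__main__":
    w = TimeWindow(open_s=100.0, close_s=500.0)
    print(collection_is_complete(120.0, 30.0, w))  # arrives early enough
    print(collection_is_complete(490.0, 30.0, w))  # arrives in window, runs out of time
```

A late arrival can satisfy the first condition and still fail the second, which is why the abstract treats arrival and completion as separate requirements.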

CLC number: V279

Sihua GAO, Bingyang ZHAO, Jianfu LI. UAV complete data collection trajectory planning algorithm based on time window constraints[J]. Acta Aeronautica et Astronautica Sinica, 2026, 47(6): 332451.

Figures/Tables: 16

Fig. 1

Fig. 2

Fig. 3

Table 1

Simulation parameters

Parameter	Value	Parameter	Value
$E_{\max}$ /kJ	50	$v_{\max}$ /(m·s⁻¹)	30
$H$ /m	10	$D$ /KB	10
$R_s$ /m	30	$R_l$ /m	40
$α, β$	10, 0.6	$W$ /MHz	1
$P_d$ /dBm	-20	$ς$	0.2
$ξ$ /dB	-30	$ϖ^2$ /dBm	-90
$ε$	2.3	$P_{pro}$ /W	79.86
$P_{ind}$ /W	88.63	$U_{tip}$ /(m·s⁻¹)	120
$\tilde{κ}$	1	$V$ /(m·s⁻¹)	4.03
$\tilde{d}$	0.6	$ϱ$ /(kg·m⁻³)	1.225
$σ_r$	0.05	$S_A$ /m²	0.503
$P_L$ /kW	2	$η$	0.15
$δ$	$10^{-6}$	$λ_c$	$1 × 10^{6}$
$λ_e$	$1 × 10^{-3}$	$λ_d$	$1 × 10^{-1}$
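Several values in Table 1 (79.86 W, 88.63 W, 120 m/s, 4.03 m/s, 0.6, 1.225 kg/m³, 0.05 and 0.503 m²) coincide with those commonly used in the rotary-wing propulsion power model of Zeng, Xu and Zhang (IEEE Transactions on Wireless Communications, 2019). This page does not reproduce the paper's energy model, so the Python sketch below assumes that model and assumes the role assigned to each parameter. It only illustrates how flight speed affects power draw and is not the authors' implementation.

```python
import math

# Values copied from Table 1. The physical role given to each one is an
# assumption based on the rotary-wing model of Zeng, Xu and Zhang (2019).
P_PRO = 79.86    # W, assumed blade profile power in hover
P_IND = 88.63    # W, assumed induced power in hover
U_TIP = 120.0    # m/s, assumed rotor blade tip speed
V_HOVER = 4.03   # m/s, assumed mean rotor induced velocity in hover (listed as V)
D_RATIO = 0.6    # assumed fuselage drag ratio
RHO = 1.225      # kg/m^3, air density
SIGMA_R = 0.05   # assumed rotor solidity
S_A = 0.503      # m^2, assumed rotor disc area


def propulsion_power(v: float) -> float:
    """Propulsion power in watts at horizontal speed v (m/s) under the assumed model."""
    # Blade profile term: grows quadratically with speed.
    blade = P_PRO * (1.0 + 3.0 * v**2 / U_TIP**2)
    # Induced term: shrinks as forward speed increases.
    ratio = v**2 / (2.0 * V_HOVER**2)
    induced = P_IND * math.sqrt(math.sqrt(1.0 + ratio**2) - ratio)
    # Parasite (fuselage drag) term: grows with the cube of speed.
    parasite = 0.5 * D_RATIO * RHO * SIGMA_R * S_A * v**3
    return blade + induced + parasite


if __name__ == "__main__":
    for speed in (0.0, 10.0, 20.0, 30.0):
        print(f"{speed:5.1f} m/s -> {propulsion_power(speed):7.2f} W")
```

Under these assumptions, hovering costs about 168.5 W, power dips at moderate speeds, and at the 30 m/s speed limit it is roughly twice the hover value.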

Fig. 4

Table 2

Table 3

Fig. 5

Table 4

Fig. 6

Fig. 7

Fig. 8

Fig. 9

Table 5

Comparison of metrics under different collection reward parameters

Time window length/s	$λ_c$ /10⁶	Flight distance/km	Completion rate/%
400	1	22.16	92.1
400	3	22.76	91.7
400	5	22.36	91.9
600	1	20.87	94.2
600	3	21.34	93.8
600	5	21.17	94.1
800	1	19.74	95.7
800	3	20.19	95.3
800	5	20.14	95.5

Table 6

Comparison of metrics under different charging reward parameters

Time window length/s	$λ_e$ /10⁻³	Flight distance/km	Completion rate/%
400	1	22.16	92.1
400	3	23.05	91.8
400	5	22.64	91.5
600	1	20.87	94.2
600	3	21.41	93.9
600	5	20.95	94.0
800	1	19.74	95.7
800	3	20.46	95.5
800	5	20.55	95.1

Table 7

Comparison of metrics under different termination reward parameters

Time window length/s	$λ_d$ /10⁻¹	Flight distance/km	Completion rate/%
400	1	22.16	92.1
400	3	22.39	91.9
400	5	22.53	91.5
600	1	20.87	94.2
600	3	20.89	93.8
600	5	21.23	93.7
800	1	19.74	95.7
800	3	19.97	95.6
800	5	20.07	95.2



Task scale	Time window length/s	H-HyAR		HyAR		PDQN		CHPPO		PPO
		Flight distance/km	Completion rate/%	Flight distance/km	Completion rate/%	Flight distance/km	Completion rate/%	Flight distance/km	Completion rate/%	Flight distance/km	Completion rate/%
60	400	15.36	97.4	16.43	95.6	18.73	94.9	17.74	95.0	19.82	93.5
60	600	15.06	97.5	16.24	96.4	17.50	95.2	17.27	96.3	19.98	94.3
60	800	14.89	99.3	15.73	99.2	17.29	98.5	17.02	99.1	19.55	98.2
80	400	17.24	93.6	17.40	93.4	19.43	92.5	18.61	92.3
80	600	16.51	95.2	17.21	94.9	19.23	94.3	18.11	93.2	20.76	92.7
80	800	16.47	96.7	17.05	95.3	19.16	94.6	18.28	95.0	20.11	93.3
100	400	22.16	92.1	23.89	91.7	24.84	89.9	24.75	91.2
100	600	20.87	94.2	22.27	93.7	23.28	93.4	22.78	92.4	24.95	91.2
100	800	19.74	95.7	20.60	94.6	22.42	93.8	22.38	94.5	23.86	91.5

Task scale	Time window length/s	Energy replenishment/(10⁴ J)
		H-HyAR	HyAR	PDQN	CHPPO	PPO
60	400	13.70	13.82	14.95	13.83	14.14
60	600	11.09	12.02	13.57	12.94	14.50
60	800	10.56	10.98	11.10	11.77	14.31
80	400	15.48	15.62	16.53	15.51
80	600	14.11	14.79	14.53	14.72	15.75
80	800	12.56	12.90	13.02	13.31	15.36
100	400	16.47	16.76	19.27	17.41
100	600	14.22	15.62	17.64	15.50	19.48
100	800	13.63	14.29	16.70	14.49	18.88

Task scale	Number of laser charging stations	H-HyAR		HyAR		PDQN		CHPPO		PPO
		Flight distance/km	Completion rate/%	Flight distance/km	Completion rate/%	Flight distance/km	Completion rate/%	Flight distance/km	Completion rate/%	Flight distance/km	Completion rate/%
60	2	17.79	95.9	18.21	94.5	19.12	93.3	18.61	93.7	20.44	92.8
60	4	15.36	97.4	16.43	95.6	18.73	94.9	17.74	95.0	19.82	93.5
60	6	15.28	97.7	16.37	95.8	18.68	95.2	17.19	95.4	19.60	94.5
80	2	19.25	92.3	19.55	91.5	20.58	90.3	20.42	91.4
80	4	17.24	93.6	17.40	93.4	19.43	92.5	18.61	92.3
80	6	17.07	94.0	17.21	93.9	19.20	93.1	18.59	92.8	21.95	92.5
100	2	25.11	91.5	25.38	90.6	25.85	89.2	25.72	90.5
100	4	22.16	92.1	23.89	91.7	24.84	89.9	24.75	91.2
100	6	22.08	92.8	22.35	92.3	24.74	90.8	24.51	91.9	25.73	90.7
