Unmanned Aerial Vehicles (UAVs) have been widely adopted to assist Wireless Sensor Networks (WSNs) in data collection tasks. However, time window constraints at the sensor nodes present new challenges. The UAV must not only arrive in the vicinity of each data-uploading node within its designated time window, but also complete the data collection before the window closes. Inefficient trajectory planning increases the UAV’s flight distance, which may compromise the completeness of data collection. Although increasing flight speed can shorten travel time, it also accelerates energy depletion, potentially leading to task failure. To address these problems, we formulate a mathematical model for the UAV trajectory planning problem in a time-window-constrained complete data collection scenario. We then propose a reinforcement learning framework based on a Hierarchical Hybrid Action Representation (H-HyAR) that jointly optimizes the UAV’s visiting order of target nodes, hovering offset, and flight speed, while capturing the hierarchical dependencies among these factors, in order to minimize the UAV’s flight distance during the data collection task. Extensive experimental results demonstrate that the H-HyAR algorithm outperforms three comparative hybrid-action reinforcement learning algorithms and the Proximal Policy Optimization (PPO) algorithm in terms of flight distance and the factors that influence it. Moreover, the H-HyAR algorithm exhibits strong robustness and generalization capabilities.