Acta Aeronautica et Astronautica Sinica > 2026, Vol. 47 Issue (6): 332451-332451   doi: 10.7527/S1000-6893.2025.32451

UAV complete data collection trajectory planning algorithm based on time window constraints

Sihua GAO (高思华), Bingyang ZHAO (赵炳阳), Jianfu LI (李建伏)

  1. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2025-06-20 Revised: 2025-07-04 Accepted: 2025-08-25 Online: 2025-09-09 Published: 2025-09-05
  • Contact: Jianfu LI E-mail: jfli@cauc.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62173332)

Abstract:

Unmanned aerial vehicles (UAVs) have been widely adopted to assist wireless sensor networks (WSNs) in data collection tasks. However, time window constraints at the sensor nodes pose new challenges: the UAV must not only fly to the vicinity of each node that has data to upload within that node's time window, but also finish collecting the data before the window closes. Poor trajectory planning increases the UAV's flight distance, so complete data collection cannot be guaranteed. Although a higher flight speed shortens flight time, it also drains the UAV's energy faster, which can cause the data collection task to fail. To address these problems, for WSN data collection scenarios in which laser charging stations are deployed, we formulate the time-window-constrained UAV trajectory planning problem for complete data collection and build a mathematical model of it. We then design a reinforcement learning framework based on a Hierarchical Hybrid Action Representation model (H-HyAR) that jointly optimizes the order in which the UAV visits target nodes, its hovering offset, and its flight speed, and that captures the hierarchical dependencies among the three, so as to minimize the UAV's flight distance during the data collection task. Simulation results show that H-HyAR outperforms three other hybrid-action reinforcement learning algorithms and the Proximal Policy Optimization (PPO) algorithm, both in flight distance and in comparative experiments on the factors that influence this metric, and that it exhibits good robustness and generalization ability.

Key words: unmanned aerial vehicle trajectory planning, hierarchical hybrid action representation, deep reinforcement learning, time window, complete data collection, wireless sensor networks
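
As an illustration of the constraint described in the abstract, the following minimal Python sketch checks whether one candidate plan (a visiting order plus a hovering offset and a flight speed per leg) lets the UAV reach every node and finish collection inside that node's time window, and reports the resulting flight distance. It is not the paper's model: the `Node` fields, the fixed per-node collection time, straight-line flight in 2D, waiting when the UAV arrives early, and the function name `evaluate_plan` are all assumptions made for this example, and energy consumption and laser charging are not modeled.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Node:
    x: float         # node position (m)
    y: float
    t_open: float    # time window opens (s)
    t_close: float   # time window closes (s)
    upload_s: float  # ASSUMED fixed time needed to collect this node's data (s)


def evaluate_plan(start, nodes, order, offsets, speeds):
    """Return (flight_distance, feasible) for one hypothetical hybrid action sequence.

    order   -- discrete part: indices of nodes in visiting order
    offsets -- continuous part: (dx, dy) hovering offset for each visit
    speeds  -- continuous part: flight speed (m/s) for each leg
    """
    (px, py), t, dist = start, 0.0, 0.0
    for idx, (dx, dy), v in zip(order, offsets, speeds):
        n = nodes[idx]
        hx, hy = n.x + dx, n.y + dy         # hovering point = node + offset
        leg = hypot(hx - px, hy - py)       # straight-line leg length
        dist += leg
        t += leg / v                        # arrival time at the hovering point
        if t > n.t_close:                   # arrived after the window closed
            return dist, False
        t = max(t, n.t_open) + n.upload_s   # wait for the window, then collect
        if t > n.t_close:                   # collection ran past the window
            return dist, False
        px, py = hx, hy
    return dist, True


if __name__ == "__main__":
    nodes = [Node(30, 40, 0, 20, 5), Node(30, 0, 10, 30, 5)]
    plan = dict(order=[0, 1], offsets=[(0, 0), (0, 0)], speeds=[10, 10])
    print(evaluate_plan((0, 0), nodes, **plan))
```

In this toy setup, flying both legs at 10 m/s satisfies both windows, while slowing the first leg to 2 m/s makes the UAV miss the first node's window. This is the coupling between visiting order, hovering position, and speed that the abstract says must be optimized jointly.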

CLC Number: