ACTA AERONAUTICA ET ASTRONAUTICA SINICA
Cooperative location of multiple UAVs with deep reinforcement learning in GPS-denied environment
Received date: 2024-08-01
Revised date: 2024-09-27
Accepted date: 2024-11-21
Online published: 2024-12-05
Supported by
National Natural Science Foundation of China (62003267); the Key Research and Development Program of Shaanxi Province (2023-GHZD-33); the Fundamental Research Funds for the Central Universities (G2022KYO602); the Key Laboratory for Electromagnetic Space Operations and Applications (2022ZX0090); the National Key Laboratory of Air-based Information Perception and Fusion (202471)
In strongly adversarial scenarios, Unmanned Aerial Vehicles (UAVs) often lose GPS because of interference, which makes it difficult to obtain an accurate position. Because UAVs often operate in formations or clusters, this paper proposes a strategy in which UAVs within the formation measure relative spatial positions and locate one another, so that a UAV can update its position in real time even after losing the GPS signal. First, for the GPS-denied environment, the theory of the Partially Observable Markov Decision Process (POMDP) is introduced and its elements are analyzed to establish a POMDP decision model for cooperative positioning and scheduling. Then, a belief state update method based on the Extended Kalman Filter (EKF) and a Q-value estimation method based on the Deep Q-Network (DQN) from deep reinforcement learning are proposed to achieve accurate real-time cooperative positioning. Application tests in different scenarios show that the proposed model manages and schedules the UAVs in a formation efficiently and directs UAVs with working GPS to cooperatively locate UAVs whose GPS has failed, which verifies the effectiveness of the model.
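To make the EKF-based belief update concrete, the following is a minimal sketch of a single measurement update. It is not the authors' implementation: the abstract does not give the state vector, measurement model, or noise values, so the sketch assumes a 2D position belief for the GPS-failed UAV, one range measurement from a GPS-equipped UAV at a known position, and a hypothetical function name `ekf_range_update`.

```python
import math

def ekf_range_update(x, P, anchor, z, r_var):
    """One EKF measurement update for a 2D position belief (illustrative only).

    x      : (px, py) mean of the GPS-failed UAV's position belief (assumed state)
    P      : 2x2 symmetric covariance as nested lists
    anchor : (ax, ay) known position of a UAV with working GPS (assumed)
    z      : measured range between the two UAVs (assumed measurement model)
    r_var  : variance of the range measurement noise (assumed)
    """
    dx, dy = x[0] - anchor[0], x[1] - anchor[1]
    pred = math.hypot(dx, dy)            # predicted range h(x)
    if pred < 1e-9:
        raise ValueError("Jacobian undefined when belief mean equals anchor")
    H = (dx / pred, dy / pred)           # Jacobian of h at x (1x2 row)

    # P @ H^T (2x1 column)
    PHt = (P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1])
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var   # innovation variance (scalar)
    K = (PHt[0] / S, PHt[1] / S)                # Kalman gain (2x1)

    innov = z - pred
    x_new = (x[0] + K[0] * innov, x[1] + K[1] * innov)
    # (I - K H) P = P - K (H P); H P equals PHt transposed because P is symmetric
    P_new = [[P[i][j] - K[i] * PHt[j] for j in range(2)] for i in range(2)]
    return x_new, P_new

if __name__ == "__main__":
    mean, cov = ekf_range_update((10.0, 0.0), [[4.0, 0.0], [0.0, 4.0]],
                                 (0.0, 0.0), 12.0, 1.0)
    print(mean, cov)
```

In this example a single range measurement reduces the uncertainty only along the line of sight to the measuring UAV. That is one reason a scheduler choosing which GPS-equipped UAV should take the next measurement can affect positioning accuracy.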
Kaifang WAN, Zhilin WU, Yunhui WU, Haozhi QIANG, Yibo WU, Bo LI. Cooperative location of multiple UAVs with deep reinforcement learning in GPS-denied environment[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2025, 46(8): 331024-331024. DOI: 10.7527/S1000-6893.2024.31024