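Autonomous decision-making in close-range game under imperfect information for unmanned aerial vehicles (2025 Supplement)

  • ZHOU Pan,
  • LI Ni,
  • HUANG Jiang-Tao,
  • YANG Qing-Lin,
  • LIAN Yun-Xiao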
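  • 1. Northwestern Polytechnical University
    2. China Aerodynamics Research and Development Center
    3. Beihang University

Received date: 2025-05-09

Revised date: 2025-06-06

Online published: 2025-06-10

Funding

National Natural Science Foundation of China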

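Cite this article

ZHOU Pan, LI Ni, HUANG Jiang-Tao, YANG Qing-Lin, LIAN Yun-Xiao. Autonomous decision-making in close-range game under imperfect information for unmanned aerial vehicles (2025 Supplement)[J]. Acta Aeronautica et Astronautica Sinica, 0 : 1 -0 . DOI: 10.7527/S1000-6893.2025.32215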

Abstract

With the convergence of computer science, automatic control theory, aircraft design, and related disciplines, autonomous decision-making for UAVs in close-range games has become one of the key technical challenges in the UAV field. To address autonomous decision-making in close-range UAV games under imperfect information, this paper proposes a decision-making method based on a pre-trained EfficientZero algorithm. First, a quaternion-based method for solving the three-degree-of-freedom (3-DOF) UAV dynamics model is implemented, and a 3-DOF close-range game environment model is built on it. Second, a deep-neural-network decision model is established that takes multi-dimensional continuous states as input and produces multi-dimensional discrete actions as output. On this basis, a method for optimizing the close-range game decision model with the pre-trained EfficientZero algorithm is proposed. Then, a model for predicting the target's maneuvering trajectory under imperfect information is established. Finally, close-range UAV game simulation experiments are carried out.
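The abstract does not give the equations of the quaternion-based dynamics solver. As a rough illustration of why a quaternion formulation is attractive for such a model (the orientation stays well defined when the aircraft passes through a vertical attitude, where Euler-angle kinematics become singular), the sketch below propagates an orientation quaternion under a constant angular rate. It is a generic textbook construction, not the authors' implementation; the function names `quat_multiply`, `propagate`, and `rotate`, the scalar-first (w, x, y, z) convention, and the rate and step size in the usage note are all assumptions made for illustration.

```python
import math


def quat_multiply(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def normalize(q):
    """Rescale q to unit length so it remains a valid rotation."""
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)


def propagate(q, omega, dt):
    """Advance orientation q by angular rate omega (rad/s) over dt seconds.

    omega is expressed in the moving frame and assumed constant over the
    step (an illustrative assumption), so the increment is the exact
    axis-angle quaternion for a rotation of |omega| * dt.
    """
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if angle < 1e-12:
        return q  # negligible rotation, keep the orientation unchanged
    s = math.sin(angle / 2.0) / rate
    dq = (math.cos(angle / 2.0), wx * s, wy * s, wz * s)
    return normalize(quat_multiply(q, dq))


def rotate(q, v):
    """Rotate vector v by unit quaternion q via q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    rotated = quat_multiply(quat_multiply(q, (0.0,) + tuple(v)), conj)
    return rotated[1:]
```

For example, starting from the identity quaternion `(1.0, 0.0, 0.0, 0.0)` and calling `propagate` 100 times with a hypothetical rate of `(0.0, 0.0, math.pi / 2)` rad/s and `dt = 0.01` accumulates a quarter turn about the z-axis, so `rotate` maps the x-axis onto the y-axis up to floating-point error. Renormalizing at every step keeps rounding drift from degrading the rotation over long simulations.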
