Autonomous decision-making in close-range game under imperfect information for unmanned aerial vehicles

doi:10.7527/S1000-6893.2025.32215

Outstanding Paper of the 2nd Aerospace Frontiers Conference / 27th Annual Meeting of the China Association for Science and Technology

Pan ZHOU¹,², Ni LI¹, Jiangtao HUANG², Qinglin YANG²,³, Yunxiao LIAN¹

1. School of Aeronautics, Northwestern Polytechnical University, Xi'an 710072, China
2. Institute of Space Technology, China Aerodynamics Research and Development Center, Mianyang 621000, China
3. School of Aeronautic Science and Engineering, Beihang University, Beijing 100191, China

Received: 2025-05-09 Revised: 2025-05-12 Accepted: 2025-05-18 Online: 2025-06-11 Published: 2025-10-30
Contact: Jiangtao HUANG, E-mail: hjtcyfx@163.com
Supported by: National Natural Science Foundation of China (52372398)

Abstract

With the convergence of computer science, automatic control theory, aircraft design, and related disciplines, autonomous decision-making for unmanned aerial vehicles (UAVs) in close-range games has become one of the key technical challenges in the UAV field. To address this problem under imperfect information, this paper proposes an autonomous decision-making method for UAV close-range games based on a pre-trained EfficientZero algorithm. First, a quaternion-based method for solving the three-degree-of-freedom (3-DOF) UAV dynamic model is implemented, and a 3-DOF UAV close-range game environment model is built on it. Second, a deep-neural-network decision-making model is established that takes multi-dimensional continuous states as input and produces multi-dimensional discrete actions as output. On this basis, a method for optimizing the close-range game decision model with the pre-trained EfficientZero algorithm is proposed. Then, a model for predicting the target's maneuvering trajectory under imperfect information is established. Finally, UAV close-range game simulation experiments are carried out.

Key words: unmanned aerial vehicles, imperfect information, autonomous decision-making, situation prediction, artificial intelligence
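
The abstract mentions a quaternion-based solution of the 3-DOF model, but this page does not reproduce the formulation. The sketch below is therefore only an illustration of the general idea, not the authors' solver: it encodes the heading angle χ and flight-path angle γ as a unit quaternion and rotates the x-axis to obtain the velocity direction. The frame convention (x and y horizontal, z up) and all function names are assumptions made for this example.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_from_angles(chi, gamma):
    """Heading chi about z (assumed up), then climb gamma about the rotated y axis."""
    qz = (math.cos(chi / 2), 0.0, 0.0, math.sin(chi / 2))
    # A positive rotation about y tilts x toward -z, so a climb uses -gamma.
    qy = (math.cos(-gamma / 2), 0.0, math.sin(-gamma / 2), 0.0)
    return quat_mul(qz, qy)

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q (q v q*)."""
    w, x, y, z = q
    p = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), (w, -x, -y, -z))
    return p[1:]

def velocity_vector(speed, chi, gamma):
    """Velocity components for a given speed, heading, and flight-path angle."""
    direction = rotate(quat_from_angles(chi, gamma), (1.0, 0.0, 0.0))
    return tuple(speed * c for c in direction)

# A vertical climb (gamma = pi/2) stays well defined in quaternion form,
# whereas the Euler-angle heading rate divides by cos(gamma) there.
v_climb = velocity_vector(100.0, 0.0, math.pi / 2)
```

Under these assumptions the result agrees with the closed form (cos γ cos χ, cos γ sin χ, sin γ) scaled by the speed.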

CLC number: V249.12

Pan ZHOU, Ni LI, Jiangtao HUANG, Qinglin YANG, Yunxiao LIAN. Autonomous decision-making in close-range game under imperfect information for unmanned aerial vehicles[J]. Acta Aeronautica et Astronautica Sinica, 2025, 46(S1): 732215.

Figures/Tables (19)

Figure 1

Table 1. Value ranges of the UAV control variables

Name	Value range
Angle of attack / (°)	$[-10, 30]$
Roll angle / (°)	$[-80, 80]$
Throttle	$[0.2, 1]\,T_{\max}$
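
The decision model described in the abstract outputs multi-dimensional discrete actions, while Table 1 gives continuous ranges; the page does not say how the ranges are discretized. As a minimal sketch, assuming a hypothetical uniform grid with five levels per control (the level count, names, and grid scheme are illustrative, not taken from the paper), a discrete action set could be built like this:

```python
from itertools import product

# Control ranges from Table 1; throttle is a fraction of maximum thrust T_max.
CONTROL_RANGES = {
    "alpha_deg": (-10.0, 30.0),  # angle of attack, degrees
    "roll_deg": (-80.0, 80.0),   # roll angle, degrees
    "throttle": (0.2, 1.0),      # fraction of T_max
}

def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi inclusive (n >= 2)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def build_action_set(levels=5):
    """Assumed uniform discretization: every combination of per-control levels."""
    grids = [linspace(lo, hi, levels) for lo, hi in CONTROL_RANGES.values()]
    return list(product(*grids))

# Each action is a tuple (alpha_deg, roll_deg, throttle).
actions = build_action_set(levels=5)
```

With five levels per control the grid holds 5 × 5 × 5 = 125 actions; a coarser or non-uniform grid would trade control resolution for a smaller search space.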

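Figure 2

Table 2. Initial states of the two sides

State variable	Red UAV initial value	Blue UAV initial value
$x/\mathrm{m}$	$0$	$[-5000, 5000]$
$y/\mathrm{m}$	$0$	$[-5000, 5000]$
$z/\mathrm{m}$	$[2000, 10000]$	$[2000, 10000]$
$v/(\mathrm{m \cdot s^{-1}})$	$[80, 200]$	$[80, 200]$
$\chi/\mathrm{rad}$	$[-\pi, \pi]$	$[-\pi, \pi]$
$\gamma/\mathrm{rad}$	$0$	$0$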
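
Table 2 lists ranges for the initial states but the page does not state how values are drawn from them. The sketch below assumes independent uniform sampling within each range, which is an assumption of this example rather than something the source confirms; the dictionary keys and the `sample_state` helper are likewise hypothetical.

```python
import math
import random

# Initial-state specification from Table 2.
# A (lo, hi) tuple is sampled; a plain float is a fixed value.
RED_INIT = {
    "x": 0.0, "y": 0.0,
    "z": (2000.0, 10000.0),
    "v": (80.0, 200.0),
    "chi": (-math.pi, math.pi),
    "gamma": 0.0,
}
BLUE_INIT = {
    "x": (-5000.0, 5000.0), "y": (-5000.0, 5000.0),
    "z": (2000.0, 10000.0),
    "v": (80.0, 200.0),
    "chi": (-math.pi, math.pi),
    "gamma": 0.0,
}

def sample_state(spec, rng):
    """Draw one initial state, sampling uniformly (assumed) inside each range."""
    return {
        name: rng.uniform(*value) if isinstance(value, tuple) else value
        for name, value in spec.items()
    }

rng = random.Random(0)  # fixed seed so an episode can be reproduced
red = sample_state(RED_INIT, rng)
blue = sample_state(BLUE_INIT, rng)
```

Passing an explicit `random.Random` instance keeps the sampling reproducible and independent of any global random state.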

Figures 3 to 17


