1 |
喻煌超, 牛轶峰, 王祥科. 无人机系统发展阶段和智能化趋势[J]. 国防科技, 2021, 42(3): 18-24.
|
|
YU H C, NIU Y F, WANG X K. Stages of development of Unmanned Aerial Vehicles[J]. National Defense Technology, 2021, 42(3): 18-24 (in Chinese).
|
2 |
ERNEST N, CARROLL D. Genetic fuzzy based artificial intelligence for unmanned combat aerial vehicle control in simulated air combat missions[J]. Journal of Defense Management, 2016, 6(1): 1000144.
|
3 |
POPE A P, IDE J S, MIĆOVIĆ D, et al. Hierarchical reinforcement learning for air-to-air combat[C]∥ 2021 International Conference on Unmanned Aircraft Systems (ICUAS). Piscataway: IEEE Press, 2021: 275-284.
|
4 |
DINARDO G. Artificial intelligence flies XQ-58A Valkyrie drone [EB/OL] (2023-08-03)[2023-12-15]. .
|
5 |
赵志忠, 高正红, 刘行伟, 等. 用攻击点推移速率评估一对一超视距空战效能[J]. 系统仿真学报, 2005, 17(12): 2855-2857, 2862.
|
|
ZHAO Z Z, GAO Z H, LIU X W, et al. Using shooting point stepping pace for evaluating one-versus-one BVR combat effectiveness[J]. Acta Simulata Systematica Sinica, 2005, 17(12): 2855-2857, 2862 (in Chinese).
|
6 |
杜海文, 崔明朗, 韩统, 等. 基于多目标优化与强化学习的空战机动决策[J]. 北京航空航天大学学报, 2018, 44(11): 2247-2256.
|
|
DU H W, CUI M L, HAN T, et al. Maneuvering decision in air combat based on multi-objective optimization and reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2018, 44(11): 2247-2256 (in Chinese).
|
7 |
AUSTIN F, CARBONE G, FALCO M, et al. Automated maneuvering decisions for air-to-air combat[C]∥ Proceedings of the Guidance, Navigation and Control Conference. Reston: AIAA, 1987:2393.
|
8 |
ISAACS R. Differential games: A mathematical theory with applications to warfare and pursuit, control and optimization[M]. Mineola: Dover Publications, 1999.
|
9 |
HUANG C Q, DONG K S, HUANG H Q, et al. Autonomous air combat maneuver decision using Bayesian inference and moving horizon optimization[J]. Journal of Systems Engineering and Electronics, 2018, 29(1): 86-97.
|
10 |
BURGIN G H, OWENS A J. An adaptive maneuvering logic computer program for the simulation of one-to-one air-to-air combat. Volume 2: Program description:NASA-CR-2583 [R]. Washington, D. C.:NASA, 1975.
|
11 |
SUN Z X, PIAO H Y, YANG Z, et al. Multi-agent hierarchical policy gradient for Air Combat Tactics emergence via self-play[J]. Engineering Applications of Artificial Intelligence, 2021, 98: 104112.
|
12 |
MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518: 529-533.
|
13 |
SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529: 484-489.
|
14 |
BERNER C, BROCKMAN G, CHAN B, et al. Dota2 with large scale deep reinforcement learning[DB/OL]. arXiv preprint: 1912.06680,2019.
|
15 |
章胜, 周攀, 何扬, 等. 基于深度强化学习的空战机动决策试验[J]. 航空学报, 2023, 44(10): 128094.
|
|
ZHANG S, ZHOU P, HE Y, et al. Air combat maneuver decision-making test based on deep reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(10): 128094 (in Chinese).
|
16 |
张建东, 王鼎涵, 杨啟明, 等. 基于分层强化学习的无人机空战多维决策[J]. 兵工学报, 2023, 44(6): 1547-1563.
|
|
ZHANG J D, WANG D H, YANG Q M, et al. Multi-dimensional decision-making for UAV air combat based on hierarchical reinforcement learning[J]. Acta Armamentarii, 2023, 44(6): 1547-1563 (in Chinese).
|
17 |
邱妍, 赵宝奇, 邹杰, 等. 基于PPO算法的无人机近距空战自主引导方法[J]. 电光与控制, 2023, 30(1): 8-14.
|
|
QIU Y, ZHAO B Q, ZOU J, et al. An autonomous guidance method of UAV in close air combat based on PPO algorithm[J]. Electronics Optics & Control, 2023, 30(1): 8-14 (in Chinese).
|
18 |
钱殿伟, 齐红敏, 刘振, 等. 基于改进近端策略优化的空战自主决策研究[J/OL]. 系统仿真学报,(2023-07-20)[2024-01-01]. .
|
|
QIAN D W, QI H M, LIU Z, et al. Research on autonomous decision-making in air-combat based on improved proximal policy optimization[J/OL]. Journal of System Simulation,(2023-07-20)[2024-01-01]. (in Chinese).
|
19 |
BARTO A G. Reinforcement learning[M]∥OMIDVAR O, ELLIOTT D L. Neural Systems for Control. Amsterdam: Elsevier, 1997: 7-30.
|
20 |
SUTTON R S, MCALLESTER D, SINGH S, et al. Policy gradient methods for reinforcement learning with function approximation[C]∥ Proceedings of the 12th International Conference on Neural Information Processing Systems. New York: ACM, 1999: 1057–1063.
|
21 |
SCHULMAN J, LEVINE S, MORITZ P, et al. Trust region policy optimization[C]∥Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37. New York:ACM,2015:1889-1897.
|
22 |
SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[DB/OL]. arXiv preprint:1707.06347,2017.
|
23 |
HAARNOJA T, ZHOU A, HARTIKAINEN K, et al. Soft actor-critic algorithms and applications[DB/OL]. arXiv preprint: 1812.05905,2018.
|
24 |
LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[DB/OL]. arXiv preprint :1509.02971, 2015.
|
25 |
FUJIMOTO S, VAN HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[C]∥ Proceedings of the 35th International Conference on Machine Learning,2018: 1587-1596.
|
26 |
SCHULMAN J, MORITZ P, LEVINE S, et al. High-dimensional continuous control using generalized advantage estimation[DB/OL]. arXiv preprint:1506.02438, 2015.
|
27 |
ENGSTROM L, ILYAS A, SANTURKAR S, et al. Implementation matters in deep policy gradients: A case study on PPO and TRPO[DB/OL]. arXiv preprint:2005.12729, 2020.
|
28 |
ZHU J Y, KUANG M C, ZHOU W Q, et al. Mastering air combat game with deep reinforcement learning[J]. Defence Technology, 2024, 34: 295-312.
|
29 |
王宝来,高显忠,谢涛,等.基于强化学习与种群博弈的近距空战决策研究[J/OL].航空学报, (2023-11-02)[2024-01-01]. .
|
|
WANG B L, GAO X Z, XIE T, et al. Research on decision-making in close-range air combat based on reinforcement learning and population game[J/OL]. Acta Aeronautica et Astronautica Sinica,(2023-11-02)[2024-01-01]. (in Chinese).
|
30 |
张婷玉, 孙明玮, 王永帅, 等. 基于深度Q网络的近距空战智能机动决策研究[J]. 航空兵器, 2023, 30(3): 41-48.
|
|
ZHANG T Y, SUN M W, WANG Y S, et al. Research on intelligent maneuvering decision-making in close air combat based on deep Q network[J]. Aero Weaponry, 2023, 30(3): 41-48 (in Chinese).
|
31 |
ZHANG H P, WEI Y J, ZHOU H, et al. Maneuver decision-making for autonomous air combat based on FRE-PPO[J]. Applied Sciences, 2022, 12(20): 10230.
|
32 |
杨晟琦, 田明俊, 司迎利, 等. 基于分层强化学习的无人机机动决策[J]. 火力与指挥控制, 2023, 48(8): 48-52, 59.
|
|
YANG S Q, TIAN M J, SI Y L, et al. Research on UAV maneuver decision-making based on hierarchical reinforcement learning[J]. Fire Control & Command Control, 2023, 48(8): 48-52, 59 (in Chinese).
|
33 |
钟友武, 柳嘉润, 杨凌宇, 等. 自主近距空战中机动动作库及其综合控制系统[J]. 航空学报, 2008, 29(S1): 114-121.
|
|
ZHONG Y W, LIU J R, YANG L Y, et al. Maneuver library and integrated control system for autonomous close-in air combat [J]. Acta Aeronautica et Astronautica Sinica, 2008, 29(S1): 114-121 (in Chinese).
|
34 |
NG A Y, HARADA D, RUSSELL S J. Policy invariance under reward transformations: theory and application to reward shaping[C]∥ Proceedings of the Sixteenth International Conference on Machine Learning. New York: ACM, 1999:278-287.
|
35 |
祝靖宇, 张宏立, 匡敏驰, 等.稀疏奖励下基于课程学习的无人机空战仿真[J].系统仿真学报,2024,36(6):1452-1467.
|
|
ZHU J Y, ZHANG H L, KUANG M C, et al. Curriculum learning based simulation of UAV air combat under sparse rewards[J]. Journal of System Simulation, 2024,36(6):1452-1467 (in Chinese).
|
36 |
周文卿, 朱纪洪, 匡敏驰. 一种基于群体智能的无人空战系统[J]. 中国科学: 信息科学, 2020, 50(3): 363-374.
|
|
ZHOU W Q, ZHU J H, KUANG M C. An unmanned air combat system based on swarm intelligence[J]. Scientia Sinica (Informationis), 2020, 50(3): 363-374 (in Chinese).
|
37 |
FAN Z, SU R, ZHANG W N, et al. Hybrid actor-critic reinforcement learning in parameterized action space[DB/OL]. arXiv preprint: 1903.01344,2019.
|