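[1] Sun Z X, Yang S Q, Piao H Y, et al. Review of the development of future intelligent air combat[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(08): 35-49 (in Chinese).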
[2] Getz W M, Pachter M. Two-target pursuit-evasion differential games in the plane[J]. Journal of Optimization Theory and Applications, 1981, 34(3): 383-403.
[3] Geng W X, Kong F, Ma D Q. Study on tactical decision of UAV medium-range air combat[C]//The 26th Chinese Control and Decision Conference (2014 CCDC), 2014: 135-139.
[4] Virtanen K, Raivio T, Hämäläinen R P. Modeling Pilot's Sequential Maneuvering Decisions by a Multistage Influence Diagram[J]. Journal of Guidance, Control, and Dynamics, 2004, 27(4): 665-677.
[5] Li B, Liang S, Tian L, et al. Intelligent Aircraft Maneuvering Decision Based on CNN[C]//Proceedings of the 3rd International Conference on Computer Science and Application Engineering, 2019: Article 138.
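[6] Zhou P, Huang J T, Zhang S, et al. Intelligent air combat decision-making and simulation based on deep reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(04): 99-112 (in Chinese).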
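[7] Li W T, Fang F, Wang Z Y, et al. Autonomous maneuver decision-making for two-aircraft formation air combat using MADDPG improved with a hybrid hypernetwork (2024 special issue on high-performance UAVs)[J]. Acta Aeronautica et Astronautica Sinica, DOI:10.7527/S1000-6893.2023.29460 (in Chinese).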
[8] Li Z L, Li B, Bai S X, et al. Autonomous air combat decision-making for UAVs based on AM-SAC[J]. Acta Armamentarii, 2023, 44(09): 2849-2858 (in Chinese).
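[9] Fu X W, Xu Z, Zhu J D, et al. Maneuver decision-making for multi-UAV attack-defense confrontation based on PER-MATD3[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(07): 196-209 (in Chinese).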
[10] Topin N, Milani S, Fang F, et al. Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2021: 9923-9931.
[11] Silva A, Gombolay M, Killian T, et al. Optimization Methods for Interpretable Differentiable Decision Trees Applied to Reinforcement Learning[C]//Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, 2020: 1855-1865.
[12] Landajuela M, Petersen B K, Kim S, et al. Discovering symbolic policies with deep reinforcement learning[C]//Proceedings of the 38th International Conference on Machine Learning, 2021: 5979-5989.
[13] Danesh M H, Koul A, Fern A, et al. Re-understanding Finite-State Representations of Recurrent Policy Networks[C]//Proceedings of the 38th International Conference on Machine Learning, 2021: 2388-2397.
[14] Greydanus S, Koul A, Dodge J, et al. Visualizing and Understanding Atari Agents[C]//Proceedings of the 35th International Conference on Machine Learning, 2018: 1792-1801.
[15] Bastani O, Pu Y, Solar-Lezama A. Verifiable Reinforcement Learning via Policy Extraction[C]//Advances in Neural Information Processing Systems, 2018: 2499-2509.
[16] Tjoa E, Guan C. A Survey on Explainable Artificial Intelligence (XAI): Toward Medical XAI[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(11): 4793-4813.
[17] Topin N, Veloso M. Generation of Policy-Level Explanations for Reinforcement Learning[DB/OL]. CoRR: abs/1905.12044, 2019.
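[18] Gao Y Y, Yu M J, Han Q S, et al. Air combat maneuver decision-making based on an improved symbiotic organisms search algorithm[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(03): 429-436 (in Chinese).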
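[19] Du H W, Cui M L, Han T, et al. Air combat maneuver decision-making based on multi-objective optimization and reinforcement learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2018, 44(11): 2247-2256 (in Chinese).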
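[20] Li Y F, Shi J P, Zhang W G, et al. Air combat maneuver decision-making for unmanned combat aircraft based on deep reinforcement learning[J]. Journal of Harbin Institute of Technology, 2021, 53(12): 33-41 (in Chinese).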
[21] Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning[DB/OL]. CoRR: abs/1509.02971, 2015.
[22] Guo W, Wu X, Khan U, et al. EDGE: Explaining Deep Reinforcement Learning Policies[C]//Advances in Neural Information Processing Systems, 2021: 12222-12236.