Acta Aeronautica et Astronautica Sinica ›› 2023, Vol. 44 ›› Issue (22): 628871. doi: 10.7527/S1000-6893.2023.28871
• Special Column •
Yupeng FU1, Xiangyang DENG1,2, Ziqiang ZHU1, Limin ZHANG1
Received: 2023-04-14
Revised: 2023-05-30
Accepted: 2023-06-14
Online: 2023-11-25
Published: 2023-06-27
Contact: Xiangyang DENG, E-mail: skl18@mails.tsinghua.edu.cn
Yupeng FU, Xiangyang DENG, Ziqiang ZHU, Limin ZHANG. Value-filter based air-combat maneuvering optimization[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(22): 628871.
| 1 | WANG Z A, LI H, WU H L, et al. Improving maneuver strategy in air combat by alternate freeze games with a deep reinforcement learning algorithm[J]. Mathematical Problems in Engineering, 2020, 2020: 1-17. |
| 2 | MA W, LI H, WANG Z, et al. Close air combat maneuver decision based on deep stochastic game[J]. Systems Engineering and Electronics, 2021, 43(2): 443-451 (in Chinese). |
| 3 | LI X G, LI Q. Technical analysis of typical intelligent game system and development prospect of intelligent command and control system[J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(1): 36-42 (in Chinese). |
| 4 | POPE A P, IDE J S, MIĆOVIĆ D, et al. Hierarchical reinforcement learning for air-to-air combat[C]∥2021 International Conference on Unmanned Aircraft Systems (ICUAS). Piscataway: IEEE Press, 2021: 275-284. |
| 5 | SUFIYAN D, WIN L T S, WIN S K H, et al. A reinforcement learning approach for control of a nature-inspired aerial vehicle[C]∥2019 International Conference on Robotics and Automation (ICRA). Piscataway: IEEE Press, 2019: 6030-6036. |
| 6 | ZHEN Y, HAO M R, SUN W D. Deep reinforcement learning attitude control of fixed-wing UAVs[C]∥2020 3rd International Conference on Unmanned Systems (ICUS). Piscataway: IEEE Press, 2020: 239-244. |
| 7 | WANG C, YAN C, XIANG X, et al. A continuous actor-critic reinforcement learning approach to flocking with fixed-wing UAVs[C]∥Asian Conference on Machine Learning. Berlin: Springer, 2020: 239-244. |
| 8 | ZHOU P, HUANG J T, ZHANG S, et al. Intelligent air combat decision making and simulation based on deep reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(4): 126731 (in Chinese). |
| 9 | WU Y J, LAI J, CHEN X L, et al. Research on the application of reinforcement learning algorithm in decision support of beyond-visual-range air combat[J]. Aero Weaponry, 2021, 28(2): 55-61 (in Chinese). |
| 10 | WANG H, ZHOU X, DENG Y M, et al. A hierarchical decision-making method for multi-aircraft air combat confrontation[J]. Scientia Sinica (Informationis), 2022, 52(12): 2225-2238 (in Chinese). |
| 11 | POMERLEAU D A. ALVINN: An autonomous land vehicle in a neural network[C]∥Conference and Workshop on Neural Information Processing Systems. New York: ACM, 1989: 305-313. |
| 12 | BOJARSKI M, DEL TESTA D, DWORAKOWSKI D, et al. End to end learning for self-driving cars[DB/OL]. arXiv preprint: 1604.07316, 2016. |
| 13 | GIUSTI A, GUZZI J, CIREŞAN D C, et al. A machine learning approach to visual perception of forest trails for mobile robots[J]. IEEE Robotics and Automation Letters, 2016, 1(2): 661-667. |
| 14 | NAKANISHI J, MORIMOTO J, ENDO G, et al. Learning from demonstration and adaptation of biped locomotion[J]. Robotics and Autonomous Systems, 2004, 47(2-3): 79-91. |
| 15 | ROSS S, GORDON G J, BAGNELL J A. A reduction of imitation learning and structured prediction to no-regret online learning[C]∥Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics. New York: PMLR, 2011: 627-635. |
| 16 | NG A Y, RUSSELL S J. Algorithms for inverse reinforcement learning[C]∥Proceedings of the Seventeenth International Conference on Machine Learning. New York: ACM, 2000: 663-670. |
| 17 | ZIEBART B D, MAAS A, BAGNELL J A, et al. Maximum entropy inverse reinforcement learning[C]∥Proceedings of the 23rd National Conference on Artificial Intelligence. New York: ACM, 2008: 1433-1438. |
| 18 | FINN C, LEVINE S, ABBEEL P. Guided cost learning: Deep inverse optimal control via policy optimization[C]∥Proceedings of the 33rd International Conference on International Conference on Machine Learning. New York: ACM, 2016: 49-58. |
| 19 | NAIR A, MCGREW B, ANDRYCHOWICZ M, et al. Overcoming exploration in reinforcement learning with demonstrations[C]∥2018 IEEE International Conference on Robotics and Automation (ICRA). Piscataway: IEEE Press, 2018: 6292-6299. |
| 20 | XU H R, ZHAN X Y, YIN H L, et al. Discriminator-weighted offline imitation learning from suboptimal demonstrations[C]∥Proceedings of the 39th International Conference on Machine Learning. New York: ACM, 2022: 24725-24742. |
| 21 | VINYALS O, BABUSCHKIN I, CZARNECKI W M, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning[J]. Nature, 2019, 575(7782): 350-354. |
| 22 | WANG P, LIU D P, CHEN J Y, et al. Decision making for autonomous driving via augmented adversarial inverse reinforcement learning[C]∥2021 IEEE International Conference on Robotics and Automation (ICRA). Piscataway: IEEE Press, 2021: 1036-1042. |
| 23 | YU Y, ZHAN D C, ZHOU Z H, et al. Unmanned aerial vehicle flight control method based on imitation learning and reinforcement learning algorithms: CN112162564B[P]. 2021-09-28 (in Chinese). |
| 24 | ZHU Z D, LIN K X, DAI B, et al. Self-adaptive imitation learning: Learning tasks with delayed rewards from sub-optimal demonstrations[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(8): 9269-9277. |
| 25 | SCHULMAN J, LEVINE S, MORITZ P, et al. Trust region policy optimization[C]∥International Conference on Machine Learning. New York: ACM, 2015: 1889-1897. |
| 26 | SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[DB/OL]. arXiv preprint: 1707.06347, 2017. |
| 27 | LI R Y, PENG H M, LI R G, et al. Overview on algorithms and applications for reinforcement learning[J]. Computer Systems and Applications, 2020, 29(12): 13-25 (in Chinese). |
| 28 | OH J, GUO Y, SINGH S, et al. Self-imitation learning[C]∥Proceedings of the 35th International Conference on Machine Learning. New York: ACM, 2018: 3878-3887. |
| 29 | HAARNOJA T, TANG H R, ABBEEL P, et al. Reinforcement learning with deep energy-based policies[C]∥Proceedings of the 34th International Conference on Machine Learning-Volume 70. New York: ACM, 2017: 1352-1361. |
| 30 | LI C, WU F G, ZHAO J S. Accelerating self-imitation learning from demonstrations via policy constraints and Q-ensemble[C]∥2023 International Joint Conference on Neural Networks (IJCNN). Piscataway: IEEE Press, 2023: 1-8. |
| 31 | SCHULMAN J, MORITZ P, LEVINE S, et al. High-dimensional continuous control using generalized advantage estimation[DB/OL]. arXiv preprint: 1506.02438, 2015. |
| 32 | KINGMA D P, BA J. Adam: A method for stochastic optimization[C]∥International Conference on Learning Representations (ICLR). San Juan, Puerto Rico, 2015. |
| 33 | MCGREW J S, HOW J P, WILLIAMS B, et al. Air-combat strategy using approximate dynamic programming[J]. Journal of Guidance, Control, and Dynamics, 2010, 33(5): 1641-1654. |
| 34 | FUJIMOTO S, GU S S. A minimalist approach to offline reinforcement learning[C]∥Thirty-Fifth Conference on Neural Information Processing Systems (NeurIPS). New York: ACM, 2021: 20132-20145. |