ACTA AERONAUTICA ET ASTRONAUTICA SINICA
Jun-Peng HUI
Received: 2022-05-11 | Revised: 2023-02-05 | Online: 2023-02-06 | Published: 2023-02-06
Contact: Jun-Peng HUI
Jun-Peng HUI. Research of intelligent guidance for no-fly zone avoidance based on reinforcement learning[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, doi: 10.7527/S1000-6893.2022.27416.
URL: https://hkxb.buaa.edu.cn/EN/10.7527/S1000-6893.2022.27416
[1] BAO W M. Present situation and development tendency of aerospace control techniques[J]. Acta Automatica Sinica, 2013, 39(6): 697-702 (in Chinese).
[2] GAO C S, CHEN E K, JING W X. Maneuver evasion trajectory optimization for hypersonic vehicles[J]. Journal of Harbin Institute of Technology, 2017, 49(4): 16-21 (in Chinese).
[3] LI K, NIE W S, FENG B M. Research on elusion capability of boost-glide vehicle[J]. Flight Dynamics, 2013, 31(2): 148-151+156 (in Chinese).
[4] LU Q, ZHOU J, ZHOU M. Reentry guidance for hypersonic vehicle considering no-fly zone[J]. Journal of Northwestern Polytechnical University, 2017, 35(5): 749-754 (in Chinese).
[5] GAO X, ZHANG L, WEI C Z. Rapid trajectory planning for reentry glide vehicle satisfying no-fly zone constraint[J]. Tactical Missile Technology, 2018(5): 62-67+94 (in Chinese).
[6] ZHAO J, ZHOU R, ZHANG C. Predictor-corrector reentry guidance satisfying no-fly zone constraints[J]. Journal of Beijing University of Aeronautics and Astronautics, 2015, 41(5): 864-870 (in Chinese).
[7] LIANG Z X, LIU S Y, LI Q D, et al. Lateral entry guidance with no-fly zone constraint[J]. Aerospace Science and Technology, 2017, 60: 39-47.
[8] ZHANG D, LIU L, WANG Y J. On-line reentry guidance algorithm with both path and no-fly zone constraints[J]. Acta Astronautica, 2015, 117: 243-253.
[9] ZHAO L B, XU W, DONG C, et al. Evasion guidance of re-entry vehicle satisfying no-fly zone constraints based on virtual goals[J]. Sci Sin Phys Mech Astron, 2021, 51(10): 104706 (in Chinese).
[10] ZHANG J L, ZHOU D P, YANG D P, et al. Computation method for reachable domain of aerospace plane under the influence of no-fly zone[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(8): 525771 (in Chinese).
[11] ZHANG J L, LIU K, FAN Y Z, et al. A piecewise predictor-corrector re-entry guidance algorithm with no-fly zone avoidance[J]. Journal of Astronautics, 2021, 42(1): 122-131 (in Chinese).
[12] LIANG Z X, REN Z. Tentacle-based guidance for entry flight with no-fly zone constraint[J]. Journal of Guidance, Control, and Dynamics, 2018, 41(4): 991-1000.
[13] GAO Y, CAI G B, XU H, et al. Reentry maneuver guidance of hypersonic glide vehicle under virtual multi-tentacle detection[J]. Acta Aeronautica et Astronautica Sinica, 2020, 41(11): 623703 (in Chinese).
[14] LI Z H, YANG X J, SUN X D, et al. Improved artificial potential field based lateral entry guidance for waypoints passage and no-fly zones avoidance[J]. Aerospace Science and Technology, 2019, 86: 119-131.
[15] YU W B, CHEN W C, JIANG Z G, et al. Analytical entry guidance for no-fly-zone avoidance[J]. Aerospace Science and Technology, 2018, 72: 426-442.
[16] SUTTON R S, BARTO A G. Reinforcement learning: An introduction[M]. Cambridge, Massachusetts: The MIT Press, 2011: 119-138.
[17] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[18] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J]. arXiv preprint arXiv:1509.02971, 2015.
[19] HAARNOJA T, ZHOU A, ABBEEL P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor[J]. arXiv preprint arXiv:1801.01290, 2018.
[20] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[J]. arXiv preprint arXiv:1707.06347, 2017.
[21] CHENG L, JIANG F H, LI J F. A review on the applications of deep learning in aircraft dynamics and control[J]. Mechanics in Engineering, 2020, 42(3): 267-276 (in Chinese).
[22] YU Y, WANG H L. Deep learning-based reentry predictor-corrector fault-tolerant guidance for hypersonic vehicles[J]. Acta Armamentarii, 2020, 41(4): 656-669 (in Chinese).
[23] SHI Y, WANG Z B. A deep learning-based approach to real-time trajectory optimization for hypersonic vehicles[C]//AIAA Scitech 2020 Forum. Orlando, 2020.
[24] CHENG L, JIANG F H, WANG Z B, et al. Multi-constrained real-time entry guidance using deep neural networks[J]. IEEE Transactions on Aerospace and Electronic Systems, 2021, 57(1): 325-340.
[25] ZHANG Q H, AO B Q, ZHANG Q X. Reinforcement learning guidance law of Q-learning[J]. Systems Engineering and Electronics, 2020, 42(2): 414-419 (in Chinese).
[26] GAUDET B, FURFARO R, LINARES R. Reinforcement meta-learning for angle-only intercept guidance of maneuvering targets[C]//AIAA Scitech 2020 Forum. Orlando, 2020.
[27] HOVELL K, ULRICH S. Deep reinforcement learning for spacecraft proximity operations guidance[J]. Journal of Spacecraft and Rockets, 2021, 58(2): 254-264.
[28] HOVELL K, ULRICH S. On deep reinforcement learning for spacecraft guidance[C]//AIAA Scitech 2020 Forum. Orlando, 2020.
[29] GUO D Z, HUANG R, XU H C, et al. Research on deep deterministic policy gradient reinforcement learning guidance method for reentry vehicle[J]. Systems Engineering and Electronics, 2021, https://kns.cnki.net/kcms/detail/11.2422.tn.20210928.1102.019.html (in Chinese).
[30] LIU Y, HE Z Z, WANG C Y, et al. Terminal guidance law design based on DDPG algorithm[J]. Chinese Journal of Computers, 2021, 44(9): 1854-1865 (in Chinese).
[31] CHAI R Q, TSOURDOS A, SAVVARIS A, et al. Six-DOF spacecraft optimal trajectory planning and real-time attitude control: A deep neural network-based approach[J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(11): 5005-5013.
[32] HUANG X, LIU J R, JIA C H, et al. Deep deterministic policy gradient algorithm for UAV control[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(11): 524688 (in Chinese).
[33] PEI P, HE S M, WANG J, et al. Integrated guidance and control for missile using deep reinforcement learning[J]. Journal of Astronautics, 2021, 42(10): 1293-1304 (in Chinese).
[34] GUO J F, CHEN Y S, BAI C C. On-orbit target approach based on reinforcement learning[J]. Aerospace Control, 2021, 39(5): 44-50 (in Chinese).
[35] FANG K, ZHANG Q Z, NI K, et al. Time-coordination reentry guidance law for hypersonic vehicle[J]. Acta Aeronautica et Astronautica Sinica, 2018, 39(5): 197-212 (in Chinese).
[36] ZHOU H Y, WANG X G, SHAN Y Z, et al. Synergistic path planning for multiple vehicles based on an improved particle swarm optimization method[J]. Acta Automatica Sinica, 2020, 46(x): 1-7 (in Chinese). DOI: 10.16383/j.aas.c190865.
[37] ZHANG W Q, YU W B, LI J L, et al. Cooperative reentry guidance for intelligent lateral maneuver of hypersonic vehicle based on downrange analytical solution[J]. Acta Armamentarii, 2021, 42(7): 1400-1411 (in Chinese).
[38] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529: 484-489.
[39] SUTSKEVER I, MARTENS J, DAHL G, et al. On the importance of initialization and momentum in deep learning[C]//International Conference on Machine Learning. 2013: 1139-1147.
[40] GLOROT X, BENGIO Y. Understanding the difficulty of training deep feedforward neural networks[C]//Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. 2010: 249-256.
[41] SUTTON R S, MCALLESTER D A, SINGH S P, et al. Policy gradient methods for reinforcement learning with function approximation[C]//Advances in Neural Information Processing Systems. 2000: 1057-1063.
[42] SCHULMAN J, LEVINE S, ABBEEL P, et al. Trust region policy optimization[C]//International Conference on Machine Learning. 2015: 1889-1897.
Address: No. 238, Baiyan Building, Beisihuan Zhonglu Road, Haidian District, Beijing, China
Postal code : 100083
E-mail: hkxb@buaa.edu.cn
All copyright © editorial office of Chinese Journal of Aeronautics