[1] FU W X, GUO H, YAN J, et al. Overview on the technology development trend of intelligent unmanned aerial vehicle[J]. Unmanned Systems Technology, 2019, 2(4):31-37 (in Chinese).
[2] FLANAGAN J, STRUTZENBERG R, MYERS R, et al. Development and flight testing of a morphing aircraft, the NextGen MFX-1[C]//48th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference. Reston:AIAA, 2007:23-26.
[3] LEI X S, TAO Y. Adaptive control for small unmanned aerial vehicle under wind disturbance[J]. Acta Aeronautica et Astronautica Sinica, 2010, 31(6):1171-1176 (in Chinese).
[4] XU R, OZGUNER U. Sliding mode control of a quadrotor helicopter[C]//Proceedings of the 45th IEEE Conference on Decision and Control. Piscataway:IEEE, 2006:4957-4962.
[5] LIU D Y, LIU H, LEWIS F L. Robust fault-tolerant formation control for tail-sitters[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(2):324296 (in Chinese).
[6] DANG X W, TANG P, SUN H Q, et al. Incremental nonlinear dynamic inversion control and flight test based on angular acceleration estimation[J]. Acta Aeronautica et Astronautica Sinica, 2020, 41(4):323534 (in Chinese).
[7] CHEN S Z, CHU L F, YANG X M, et al. Application of state prediction neural network control algorithm in small reusable rocket[J]. Acta Aeronautica et Astronautica Sinica, 2019, 40(3):322286 (in Chinese).
[8] LIU J K. Intelligent control[M]. 4th ed. Beijing:Publishing House of Electronics Industry, 2017:178-179 (in Chinese).
[9] NG A Y, COATES A, DIEL M, et al. Autonomous inverted helicopter flight via reinforcement learning[M]//Experimental Robotics IX. Berlin, Heidelberg:Springer, 2006:363-372.
[10] ABBEEL P, COATES A, QUIGLEY M, et al. An application of reinforcement learning to aerobatic helicopter flight[C]//Advances in Neural Information Processing Systems 19:Proceedings of the 2006 Conference. Cambridge:MIT Press, 2007:1-8.
[11] SILVER D, LEVER G, HEESS N, et al. Deterministic policy gradient algorithms[C]//31st International Conference on Machine Learning, 2014:387-395.
[12] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[DB/OL]. arXiv preprint:1509.02971, 2015.
[13] SCHULMAN J, LEVINE S, MORITZ P, et al. Trust region policy optimization[DB/OL]. arXiv preprint:1502.05477, 2015.
[14] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[DB/OL]. arXiv preprint:1707.06347, 2017.
[15] HWANGBO J, SA I, SIEGWART R, et al. Control of a quadrotor with reinforcement learning[J]. IEEE Robotics and Automation Letters, 2017, 2(4):2096-2103.
[16] KOCH W, MANCUSO R, WEST R, et al. Reinforcement learning for UAV attitude control[DB/OL]. arXiv preprint:1804.04154, 2018.
[17] LIN X B, YU Y, SUN C Y. Supplementary reinforcement learning controller designed for quadrotor UAVs[J]. IEEE Access, 2019, 7:26422-26431.
[18] WANG Y D, SUN J, HE H B, et al. Deterministic policy gradient with integral compensator for robust quadrotor control[J]. IEEE Transactions on Systems, Man, and Cybernetics:Systems, 2020, 50(10):3713-3725.
[19] FENG C. Essentials of reinforcement learning:Core algorithm and TensorFlow implementation[M]. Beijing:Publishing House of Electronics Industry, 2018 (in Chinese).
[20] KONDA V R, TSITSIKLIS J N. Actor-critic algorithms[C]//Advances in Neural Information Processing Systems, 2000:1008-1014.
[21] WATKINS C J C H. Learning from delayed rewards[D]. Cambridge:University of Cambridge, 1989.
[22] SUTTON R S. Learning to predict by the methods of temporal differences[J]. Machine Learning, 1988, 3(1):9-44.
[23] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[C]//26th Neural Information Processing Systems, 2013:201-220.
[24] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540):529-533.