[1] 高正, 陈仁良. 直升机飞行动力学[M]. 科学出版社, 2003: 1-232.
GAO Z, CHEN R L. Helicopter flight dynamics[M]. Science Press, 2003: 1-232. (in Chinese).
[2] 陈仁良, 李攀, 吴伟, 等. 直升机飞行动力学数学建模问题[J]. 航空学报, 2017, 38(7): 6-22.
CHEN R L, LI P, WU W, et al. A review of mathematical modeling of helicopter flight dynamics[J]. Acta Aeronautica et Astronautica Sinica, 2017, 38(7): 6-22. (in Chinese).
[3] 李攀. 旋翼非定常自由尾迹及高置信度直升机飞行力学建模研究[D]. 南京: 南京航空航天大学, 2010: 1-169.
LI P. Research on the rotor unsteady free-vortex wake and high-fidelity mathematical modeling of helicopter flight dynamics[D]. Nanjing: Nanjing University of Aeronautics and Astronautics, 2010: 1-169. (in Chinese).
[4] Balas G J, Packard A K, Renfrow J, et al. Control of the F-14 aircraft lateral-directional axis during powered approach[J]. Journal of Guidance, Control, and Dynamics, 1998, 21(6): 899-908.
[5] 郑峰婴, 沈志敏, 李雅琴, 等. 共轴高速直升机增益自适应多模式切换控制[J]. 航空学报, 2024, 45(09): 219-236.
ZHENG F Y, SHEN Z M, LI Y Q, et al. Gain adaptive multi-mode switching control for coaxial high-speed helicopter[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(9): 219-236. (in Chinese).
[6] Catak A, Altunkaya E C, Demir M, et al. Enhanced flight envelope protection: a novel reinforcement learning approach[J]. IFAC-PapersOnLine, 2024, 58(30): 207-212.
[7] Wise K A. Design parameter tuning in adaptive observer-based flight control architectures[M]. AIAA Information Systems-AIAA Infotech@Aerospace, 2018: 2-48.
[8] 仇钰清, 李俨, 郎金溪, 等. 高速直升机过渡模态鲁棒自适应姿态控制[J]. 航空学报, 2024, 45(09): 248-261.
QIU Y Q, LI Y, LANG J X, et al. Robust adaptive attitude control of high-speed helicopters in transition mode[J]. Acta Aeronautica et Astronautica Sinica, 2024, 45(9): 248-261. (in Chinese).
[9] Lake B M, Baroni M. Human-like systematic generalization through a meta-learning neural network[J]. Nature, 2023, 623(7985): 115-121.
[10] Sutton R S, Barto A G. Reinforcement learning: An introduction[M]. Cambridge: MIT Press, 1998: 1-322.
[11] Sönmez S, Rutherford M J, Valavanis K P. A survey of offline- and online-learning-based algorithms for multirotor UAVs[J]. Drones, 2024, 8(4): 1-116.
[12] Richter D J, Calix R A, Kim K. A review of reinforcement learning for fixed-wing aircraft control tasks[J]. IEEE Access, 2024, 12: 103026-103048.
[13] Shadeed O, Hasanzade M, Koyuncu E. Deep reinforcement learning based aggressive flight trajectory tracker[C]. AIAA SciTech 2021 Forum. 2021: 0777-0779.
[14] Manukyan A, Olivares-Mendez M A, Geist M, et al. Deep reinforcement learning-based continuous control for multicopter systems[C]. 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT). IEEE, 2019: 1876-1881.
[15] Hwangbo J, Sa I, Siegwart R, et al. Control of a quadrotor with reinforcement learning[J]. IEEE Robotics and Automation Letters, 2017, 2(4): 2096-2103.
[16] Koch W, Mancuso R, West R, et al. Reinforcement learning for UAV attitude control[J]. ACM Transactions on Cyber-Physical Systems, 2019, 3(2): 1-21.
[17] Xu J, Du T, Foshey M, et al. Learning to fly: computational controller design for hybrid UAVs with reinforcement learning[J]. ACM Transactions on Graphics (TOG), 2019, 38(4): 1-12.
[18] Lopes G C, Ferreira M, da Silva Simões A, et al. Intelligent control of a quadrotor with proximal policy optimization reinforcement learning[C]. 2018 Latin American Robotic Symposium, 2018 Brazilian Symposium on Robotics (SBR) and 2018 Workshop on Robotics in Education (WRE). IEEE, 2018: 503-508.
[19] Molchanov A, Chen T, Hönig W, et al. Sim-to-(multi)-real: Transfer of low-level robust control policies to multiple quadrotors[C]. 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2019: 59-66.
[20] Li Z, Xue S, Lin W, et al. Training a robust reinforcement learning controller for the uncertain system based on policy gradient method[J]. Neurocomputing, 2018, 316: 313-321.
[21] Zhen Y, Hao M, Sun W. Deep reinforcement learning attitude control of fixed-wing UAVs[C]. 2020 3rd International Conference on Unmanned Systems (ICUS). IEEE, 2020: 239-244.
[22] Bekar C, Yuksek B, Inalhan G. High fidelity progressive reinforcement learning for agile maneuvering UAVs[C]. AIAA SciTech 2020 Forum. 2020: 0898-0900.
[23] Kim J, Jung S. Enhancing UAV stability: a deep reinforcement learning strategy[C]. 2024 International Conference on Electronics, Information, and Communication (ICEIC). IEEE, 2024: 1-4.
[24] Aoun C, Moncayo H. Disturbance observer-based reinforcement learning control and the application to a non-linear dynamic system[C]. AIAA SciTech 2022 Forum. 2022: 1586-1588.
[25] Wang Y, Sun J, He H, et al. Deterministic policy gradient with integral compensator for robust quadrotor control[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2019, 50(10): 3713-3725.
[26] Akhtar M, Maqsood A. Comparative analysis of deep reinforcement learning algorithms for hover-to-cruise transition maneuvers of a tilt-rotor unmanned aerial vehicle[J]. Aerospace, 2024, 11(12): 1040-1042.
[27] Puterman M L. Markov decision processes[J]. Handbooks in Operations Research and Management Science, 1990, 2: 331-434.
[28] Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms[DB/OL]. arXiv preprint arXiv:1707.06347, 2017.
[29] Schulman J, Moritz P, Levine S, et al. High-dimensional continuous control using generalized advantage estimation[DB/OL]. arXiv preprint arXiv:1506.02438, 2015.
[30] Grondman I, Busoniu L, Lopes G A D, et al. A survey of actor-critic reinforcement learning: Standard and natural policy gradients[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2012, 42(6): 1291-1307.
[31] Borisov A, Mamaev I S. Rigid body dynamics[M]. Walter de Gruyter GmbH & Co KG, 2018: 1-271.
[32] Li P, Chen R L. A mathematical model for helicopter comprehensive analysis[J]. Chinese Journal of Aeronautics, 2010, 23: 320-326.
[33] Pitt D M, Peters D A. Theoretical prediction of dynamic inflow derivatives[J]. Vertica, 1981, 5(1): 21-34.
[34] Ballin M G. Validation of a real-time engineering simulation of the UH-60A helicopter[R]. NASA TM-88360, 1987.
[35] Andrychowicz M, Raichuk A, Stańczyk P, et al. What matters for on-policy deep actor-critic methods? A large-scale study[C]. International Conference on Learning Representations. 2021: 1-10.
[36] Wu Y F, Zhang W, Xu P, et al. A finite-time analysis of two time-scale actor-critic methods[J]. Advances in Neural Information Processing Systems, 2020, 33: 17617-17628.
[37] Welcer M, Szczepański C, Krawczyk M. The impact of sensor errors on flight stability[J]. Aerospace, 2022, 9(3): 169.
[38] Tripathi S, Wagh P, Chaudhary A B. Modelling, simulation & sensitivity analysis of various types of sensor errors and its impact on Tactical Flight Vehicle navigation[C]. 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT). IEEE, 2016: 938-942.
[39] Zheng T, Xu A, Xu X, et al. Modeling and compensation of inertial sensor errors in measurement systems[J]. Electronics, 2023, 12(11): 2458.