Reinforcement-enhanced particle swarm optimization of maneuver trajectories and total-energy anti-disturbance control

DONG Zhe; LIU Kai; WANG Guan; WANG Zhen-Wei

doi:10.7527/S1000-6893.2026.33379

Acta Aeronautica et Astronautica Sinica

Received date: 2026-01-15

Revised date: 2026-04-28

Online published: 2026-04-30

Abstract

This paper addresses tactical maneuver trajectory optimization and trajectory tracking control for unmanned combat aerial vehicles (UCAVs) in highly dynamic within-visual-range air combat. By jointly considering air combat situational metrics and maneuver energy, a lightweight reinforcement learning particle swarm trajectory optimizer and a total energy linearized tracking control framework are developed for four representative tactical maneuvers: Loop, Immelmann, High Yo-Yo, and Barrel Roll. First, an angle-of-attack (AOA) coupled thrust model and a load factor envelope model incorporating maneuvering energy information are established to support maneuver optimization and total energy control design under propulsion–airframe coupling. Second, segmented command models are constructed for the four maneuvers, using the rates of change of AOA, throttle, and velocity roll angle as command primitives. Third, an improved particle swarm optimization algorithm is proposed that integrates a Q-learning-driven particle learning paradigm with a K-nearest-neighbor differential evolution mechanism, enabling maneuver trajectory optimization with respect to relative geometry, attitude, load factor, and thrust-work metrics. The optimized trajectory commands are then converted into total energy rate and energy allocation rate commands. A total energy linearized tracking law is derived by linearizing the augmented total energy dynamics and assigning gains according to the desired time-domain response. An adaptive correction factor based on online aerodynamic identification is further introduced to compensate for aerodynamic perturbations. Finally, numerical simulations of the four tactical maneuvers validate the proposed maneuver optimizer and total energy linearized controller. The results show reduced maneuver energy loss and optimization time, together with significantly improved tracking accuracy and robustness under propulsion–airframe coupling and aerodynamic perturbations.
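
The optimizer described above combines a Q-learning-driven choice of particle update rule with a K-nearest-neighbor differential evolution step. The abstract does not give the state, action, or reward definitions, so the following is only a minimal sketch under stated assumptions: each particle keeps a two-state, two-action Q-table, the state is the action it took last, the reward is 1 when its personal best improves, and a toy `sphere` cost stands in for the maneuver cost. The names `rl_pso` and `sphere` and all parameter values are hypothetical and are not taken from the paper.

```python
import random


def sphere(x):
    """Toy cost (assumption); the paper's cost uses geometry, attitude, load factor, thrust work."""
    return sum(v * v for v in x)


def rl_pso(cost, dim=4, n_particles=20, iters=200, k=3, seed=0,
           alpha=0.1, gamma=0.9, eps=0.1, lo=-5.0, hi=5.0):
    """Sketch of PSO where Q-learning picks between a PSO move and a KNN-DE mutation."""
    rng = random.Random(seed)

    def clip(v):
        return min(hi, max(lo, v))

    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]

    # Assumed Q-table layout: q[particle][state][action]; state = previous action.
    # Action 0 = standard PSO velocity update, action 1 = KNN differential mutation.
    q = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(n_particles)]
    state = [0] * n_particles

    for _ in range(iters):
        for i in range(n_particles):
            s = state[i]
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = 0 if q[i][s][0] >= q[i][s][1] else 1

            if a == 0:
                for d in range(dim):
                    vel[i][d] = (0.7 * vel[i][d]
                                 + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                                 + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                cand = [clip(pos[i][d] + vel[i][d]) for d in range(dim)]
            else:
                # k nearest personal bests (squared Euclidean distance) form the DE pool.
                nbrs = sorted(
                    (j for j in range(n_particles) if j != i),
                    key=lambda j: sum((pbest[j][d] - pbest[i][d]) ** 2
                                      for d in range(dim)))[:k]
                r1, r2 = rng.sample(nbrs, 2)
                base = min(nbrs, key=pbest_f.__getitem__)
                cand = [clip(pbest[base][d] + 0.5 * (pbest[r1][d] - pbest[r2][d]))
                        for d in range(dim)]

            pos[i] = cand
            f = cost(cand)
            reward = 1.0 if f < pbest_f[i] else 0.0
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = cand[:], f
                if f < gbest_f:
                    gbest, gbest_f = cand[:], f

            # One-step Q-learning update; the next state is the action just taken.
            q[i][s][a] += alpha * (reward + gamma * max(q[i][a]) - q[i][s][a])
            state[i] = a
        history.append(gbest_f)

    return gbest, gbest_f, history


if __name__ == "__main__":
    best, best_f, hist = rl_pso(sphere)
    print("initial best cost:", hist[0], "final best cost:", best_f)
```

In a maneuver setting, the decision vector would presumably hold the segmented command parameters (rates of AOA, throttle, and velocity roll angle), and `cost` would evaluate a simulated trajectory. That mapping is not specified in the abstract and is left out of the sketch.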

Keywords: tactical maneuver optimization; reinforcement learning; evolutionary particle swarm optimization; maneuvering energy; total energy control; AOA/thrust coupling

Cite this article

DONG Zhe, LIU Kai, WANG Guan, WANG Zhen-Wei. Reinforcement-enhanced particle swarm optimization of maneuver trajectories and total-energy anti-disturbance control[J]. Acta Aeronautica et Astronautica Sinica, 0: 1-0. DOI: 10.7527/S1000-6893.2026.33379
