Acta Aeronautica et Astronautica Sinica ›› 2025, Vol. 46 ›› Issue (S1): 732184. doi: 10.7527/S1000-6893.2025.32184

• Excellent Papers of the 2nd Aerospace Frontiers Conference/the 27th Annual Meeting of the China Association for Science and Technology

Design of reward functions for helicopter attitude control in reinforcement learning

Tao ZHANG, Pan LI, Zixu WANG, Zhenhua ZHU

  1. National Key Laboratory of Helicopter Dynamics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  • Received: 2025-02-25 Revised: 2025-03-08 Accepted: 2025-05-06 Online: 2025-05-29 Published: 2025-05-27
  • Contact: Pan LI, E-mail: lipan@nuaa.edu.cn
  • Supported by:
    National Level Project

Abstract:

Reward function design is a core element of helicopter attitude control based on reinforcement learning, because it directly determines both how the controller trains and how it performs. Designing a comprehensive and efficient reward function is therefore a key research topic in the field. To this end, a phased reward function framework is proposed that divides the full time-domain control process into two control stages. Reward sub-terms are designed for each stage, and adjustable parameters are introduced that allow coarse-grained tuning of control performance. Based on the Actor-Critic method, a simple neural network attitude controller is designed and trained with the Proximal Policy Optimization (PPO) algorithm. The proposed method is validated through robustness tests with injected sensor error and through comparative experiments against a baseline reward function. Across 100 step-command simulation trials, compared with the baseline method, the number of cases with steady-state error below 10% increases by 16%, the number of cases with overshoot below 10% of the command amplitude increases by 9%, and the number of cases with settling time below 4 s increases by 7%. In addition, the controller still completes the attitude control task under significant sensor error.
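
As a rough illustration of what a two-stage reward might look like, the following Python sketch switches between a transient-stage term and a steady-stage term. It is not the reward function from the paper: the abstract does not give the sub-terms, the switching rule, or the parameter values, so the function name `phased_reward`, the time-based switch `switch_time`, and the weights `w_track`, `w_hold`, and `w_rate` are all hypothetical.

```python
def phased_reward(error, rate, t,
                  switch_time=2.0,   # assumed stage boundary in seconds (hypothetical)
                  w_track=1.0,       # assumed stage-1 weight on attitude error
                  w_hold=2.0,        # assumed stage-2 weight on residual error
                  w_rate=0.1):       # assumed weight on angular rate in both stages
    """Hypothetical two-stage reward for one attitude channel.

    error: attitude command minus measured attitude (rad)
    rate:  measured angular rate (rad/s)
    t:     time since the command was issued (s)
    """
    if t < switch_time:
        # Stage 1 (transient): encourage fast reduction of the tracking error.
        return -w_track * abs(error) - w_rate * abs(rate)
    # Stage 2 (steady): weight residual error more heavily to discourage
    # steady-state offset, and keep the rate penalty to damp oscillation.
    return -w_hold * abs(error) - w_rate * abs(rate)


if __name__ == "__main__":
    # Same error and rate, evaluated in each stage.
    print(phased_reward(0.1, 0.0, t=0.5))
    print(phased_reward(0.1, 0.0, t=3.0))
```

In a sketch like this, the weights play the role of the adjustable parameters mentioned above: raising `w_hold` relative to `w_track` would trade response speed for steady-state accuracy, though the actual parameterization in the paper may differ.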

Key words: aircraft intelligent control, helicopter attitude control, reinforcement learning, reward function, neural networks

CLC Number: