Acta Aeronautica et Astronautica Sinica ›› 2024, Vol. 45 ›› Issue (22): 330195. doi: 10.7527/S1000-6893.2024.30195

• Electronics and Electrical Engineering and Control •

Greedy-PPO intelligent spectrum sharing decision for complex electromagnetic interference environments

Kaijie YIN1, Jia SHI1, Guodong DUAN2, Lixin LI3, Jiangbo SI1

  1. School of Telecommunications Engineering, Xidian University, Xi’an 710071, China
    2. Southwest China Research Institute of Electronic Equipment, Chengdu 610036, China
    3. School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China
  • Received: 2024-01-19 Revised: 2024-02-05 Accepted: 2024-02-29 Online: 2024-11-25 Published: 2024-03-11
  • Contact: Jia SHI E-mail: jiashi@xidian.edu.cn
  • Supported by:
    Key Laboratory Fund for Electromagnetic Space Operations and Applications (JJ2021-001)

Abstract:

To address the intense frequency conflicts among multi-functional electromagnetic equipment in complex electromagnetic environments, this paper studies a reinforcement-learning-based intelligent spectrum sharing technique that handles coupled decision-making over a hybrid continuous-discrete action space. First, a detailed model of the complex electromagnetic interference environment is developed, accounting for factors such as the frequency-usage rules of both the friendly side and the jamming side. Based on this model, a spectrum sharing efficiency evaluation index is designed for integrated radar-communication equipment under multitask requirements. Second, a Greedy Proximal Policy Optimization (Greedy-PPO) intelligent spectrum sharing decision algorithm is proposed. It decouples the discrete and continuous action spaces: the PPO method optimizes the allocation of transmission power, and the Greedy method then solves the discrete spectrum allocation problem, yielding an approximately optimal joint spectrum sharing strategy. Finally, simulation experiments verify that the Greedy-PPO algorithm improves overall performance by 48% and 15% compared with the greedy algorithm and the DDQN algorithm, respectively, demonstrating excellent spectrum utilization performance.
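
The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the SINR-based rate model, the function names `sinr` and `greedy_channel_allocation`, and all parameters are assumptions made for illustration only. The transmit powers are passed in as fixed numbers, standing in for the continuous output that a trained PPO policy would supply, and only the discrete greedy channel-selection step is shown.

```python
import math

def sinr(power, gain, interference, noise=1e-3):
    """SINR of one link on one channel (illustrative model, not the paper's)."""
    return power * gain / (interference + noise)

def greedy_channel_allocation(powers, gains, jam_power):
    """Greedily assign each device the channel with the highest estimated rate.

    powers[i]    : transmit power of device i (assumed to come from a PPO policy)
    gains[i][c]  : channel gain of device i on channel c (hypothetical input)
    jam_power[c] : external jamming power on channel c (hypothetical input)
    """
    n_channels = len(jam_power)
    load = list(jam_power)  # running interference seen on each channel
    assignment = []
    total_rate = 0.0
    for i, p in enumerate(powers):
        # Shannon-style rate estimate on every channel under the current load.
        rates = [math.log2(1.0 + sinr(p, gains[i][c], load[c]))
                 for c in range(n_channels)]
        best = max(range(n_channels), key=rates.__getitem__)
        assignment.append(best)
        # Rate at decision time; interference from later devices is not
        # fed back into earlier devices' rates in this simplified sketch.
        total_rate += rates[best]
        # Devices assigned later see this device as co-channel interference.
        load[best] += p * gains[i][best]
    return assignment, total_rate

# Two devices, two channels; channel 1 carries some jamming power.
assignment, total = greedy_channel_allocation(
    powers=[1.0, 1.0],
    gains=[[1.0, 1.0], [1.0, 1.0]],
    jam_power=[0.0, 0.5],
)
print(assignment)
```

In this toy case the first device takes the clean channel and the second moves to the lightly jammed one, because sharing the clean channel would yield a lower estimated rate. In a full hybrid scheme of the kind the abstract outlines, the reward obtained after this discrete step would be used to update the continuous power policy.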

Key words: spectrum sharing, reinforcement learning, rule algorithm, decision management, hybrid action space

CLC Number: