Electronics and Electrical Engineering and Control

Greedy-PPO intelligent spectrum sharing decision for complex electromagnetic interference environments

  • Kaijie YIN,
  • Jia SHI,
  • Guodong DUAN,
  • Lixin LI,
  • Jiangbo SI
  • 1. School of Telecommunications Engineering, Xidian University, Xi’an 710071, China
    2. Southwest China Research Institute of Electronic Equipment, Chengdu 610036, China
    3. School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China

Received date: 2024-01-19

Revised date: 2024-02-05

Accepted date: 2024-02-29

Online published: 2024-03-11

Supported by

Key Laboratory Fund for Electromagnetic Space Operations and Applications (JJ2021-001)

Abstract

To address the intense frequency conflicts among multi-functional electromagnetic equipment in complex electromagnetic environments, an intelligent spectrum sharing technique based on reinforcement learning is studied, targeting the challenge of coupled decision-making over hybrid continuous and discrete actions. First, a detailed model of the complex electromagnetic interference environment is developed, accounting for factors such as the frequency-use rules of both the friendly side and the jamming side. Based on this model, a spectrum sharing efficiency evaluation index is designed for integrated radar-communication equipment under multitask requirements. Second, a Greedy Proximal Policy Optimization (Greedy-PPO) intelligent spectrum sharing decision algorithm is proposed. It decouples the discrete and continuous action spaces: the PPO method optimizes the allocation of transmission power, and the greedy method then solves the discrete spectrum allocation problem, yielding an approximately optimal joint spectrum sharing strategy. Finally, simulation experiments verify that the Greedy-PPO algorithm improves overall performance by 48% and 15% compared with the greedy algorithm and the DDQN algorithm, respectively, demonstrating excellent spectrum utilization.
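The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the system model, so the function names, the SINR-based rate, the simplified co-channel interference term, and all parameters below are assumptions made for illustration. The first function is the standard PPO clipped surrogate objective that a continuous power policy would be trained on; the second is a simple greedy channel assignment that takes the resulting transmit powers as given.

```python
import math


def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    ratios[i] is pi_new(a|s) / pi_old(a|s) for sample i;
    advantages[i] is the advantage estimate for that sample.
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - eps, min(1.0 + eps, r))
        total += min(r * a, clipped * a)
    return total / len(ratios)


def greedy_channel_assignment(powers, gains, jam_power, noise=1e-3):
    """Hypothetical greedy spectrum step: devices pick channels one by one.

    powers[i]    : transmit power of device i (assumed to come from the
                   continuous power policy)
    gains[i][c]  : assumed channel gain of device i on channel c
    jam_power[c] : assumed jamming power received on channel c
    """
    n_channels = len(jam_power)
    # Simplification (assumption): interference from earlier devices is
    # modeled as their received power accumulated per channel.
    load = [0.0] * n_channels
    assignment = []
    for i, p in enumerate(powers):
        best_c, best_rate = 0, -1.0
        for c in range(n_channels):
            sinr = p * gains[i][c] / (noise + jam_power[c] + load[c])
            rate = math.log2(1.0 + sinr)
            if rate > best_rate:
                best_c, best_rate = c, rate
        assignment.append(best_c)
        load[best_c] += p * gains[i][best_c]
    return assignment
```

In this toy setting, the greedy step only needs the powers as input, which is what lets the continuous and discrete decisions be handled by separate procedures; how the paper couples the two during training is not stated in the abstract.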

Cite this article

Kaijie YIN, Jia SHI, Guodong DUAN, Lixin LI, Jiangbo SI. Greedy-PPO intelligent spectrum sharing decision for complex electromagnetic interference environments[J]. ACTA AERONAUTICA ET ASTRONAUTICA SINICA, 2024, 45(22): 330195-330195. DOI: 10.7527/S1000-6893.2024.30195
