Acta Aeronautica et Astronautica Sinica > 2023, Vol. 44 Issue (18): 328301-328301   doi: 10.7527/S1000-6893.2023.28301

Intelligent cooperative interception strategy of aircraft against cluster attack

Shuyi GAO1, Defu LIN1, Duo ZHENG1, Xinyu HU2

  1. School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
  2. Xuteli School, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2022-11-23 Revised: 2022-12-20 Accepted: 2023-02-22 Published: 2023-08-04 Online: 2023-03-03
  • Contact: Duo ZHENG E-mail:zhengduohello@126.com
  • Supported by:
    National Natural Science Foundation of China (61903350); Ministry of Education's industry-university-research innovation project (2021ZYA02002); Beijing Institute of Technology Research Fund Program for Young Scholars (3010011182130)

Abstract:

Interception confrontation between unmanned clusters is an important operational scenario in future intelligent warfare. To address the cooperative interception game against attacks by aircraft clusters, a multi-agent deep reinforcement learning cooperative interception strategy based on proximal policy optimization is proposed. The single-agent proximal policy optimization algorithm is combined with a centralized-evaluation, distributed-execution architecture to design a multi-agent reinforcement learning maneuver strategy. On this basis, the generalized advantage function is introduced to address the slow convergence of the algorithm. Simulation results show that the multi-aircraft cooperative interception strategy gives the aircraft the ability to learn autonomously and to assign interception tasks according to the real-time battlefield situation, and that constraining the size of each policy update improves the convergence rate. Through continued iterative self-learning, the interception strategy optimizes itself and improves cooperative interception effectiveness across different scenarios.

Key words: multi-target cooperative interception, proximal policy optimization, multi-agent reinforcement learning, centralized evaluation-distributed execution, deep learning
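
The abstract names two ingredients, a generalized advantage function and a constraint on how far each policy update may move. The sketch below illustrates one common way to compute both; it is not the authors' implementation. The function names, the discount `gamma`, the smoothing factor `lam`, and the clipping width `clip_eps` are assumptions chosen for illustration, and the value estimates are assumed to come from a centralized critic that sees the joint state.

```python
def gae_advantages(rewards, values, last_value=0.0, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory segment.

    rewards[t] is the reward at step t and values[t] is the critic's
    estimate V(s_t). last_value is V(s_T) for the state after the final
    step (0.0 if the episode ended). gamma and lam are assumed
    hyperparameters, not values taken from the paper.
    """
    next_values = list(values[1:]) + [last_value]
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step reuses the discounted sum of later residuals.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_values[t] - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of proximal policy optimization.

    ratios[t] is pi_new(a_t | o_t) / pi_old(a_t | o_t) for one agent.
    Clipping the ratio to [1 - clip_eps, 1 + clip_eps] removes the
    incentive to move the policy far from the one that collected the data.
    """
    terms = []
    for ratio, adv in zip(ratios, advantages):
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return sum(terms) / len(terms)


if __name__ == "__main__":
    adv = gae_advantages([1.0, 1.0], [0.0, 0.0], gamma=1.0, lam=1.0)
    print(adv)
    print(clipped_surrogate([1.5, 0.5], adv))
```

In a multi-agent setting of the kind the abstract describes, each interceptor would evaluate `clipped_surrogate` with its own probability ratios computed from local observations, while the advantages are shared through the centralized critic; that division is an assumption of this sketch.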

CLC number: