Long time tracking beam scheduling and waveform optimization strategy for phased array radar
LIU Yiming, SHENG Wen, HU Bing, ZHANG Lei
Air Defense Early Warning Equipment Department, Air Force Early Warning Academy, Wuhan 430019, China
Abstract: To address multi-target tracking beam scheduling and waveform parameter optimization control for phased array radar, a tracking beam scheduling and waveform parameter optimization strategy based on the Markov Decision Process (MDP) is proposed. The Unscented Kalman Filter (UKF) is used to estimate the target state. First, the sequential decision problem is modeled as a Markov decision process, and the cost-effectiveness ratio and the long-term return rate of the resource are defined. Then, the current actual tracking error is integrated into the reward function of the MDP, and the optimization model of joint scheduling is given. Finally, the long-term decision problem is cast in a dynamic programming structure, and a parallel hybrid genetic particle swarm optimization algorithm is proposed to solve for the optimal strategy at each decision time. Simulation results demonstrate the effectiveness of the strategy and the superiority of the optimization algorithm. Compared with the traditional "short-term" strategy, the tracking accuracy can be improved by 11.17%.
Keywords: phased array radar; beam scheduling; waveform parameter optimization; Markov decision process; Unscented Kalman Filter (UKF); long-term return rate; hybrid genetic particle swarm optimization

1 MDP-based tracking beam and waveform scheduling model

Fig. 1 Schematic diagram of the long-term decision-making process at time $t_k$
1.1 Scheduling actions

1.2 System state and state transition

 ${\mathit{\boldsymbol{X}}_{k + 1}} = {\mathit{\boldsymbol{F}}_k}{\mathit{\boldsymbol{X}}_k} + {\mathit{\boldsymbol{\omega }}_k}$ （1）

1.3 System observation and observation matrix

 ${\mathit{\boldsymbol{Z}}_{k + 1}} = H\left( {{\mathit{\boldsymbol{X}}_{k + 1}}} \right) + {\mathit{\boldsymbol{v}}_k}$ （2）

1.4 Reward function

 $E\left\{ {\left( {{{\mathit{\boldsymbol{\hat X}}}_k} - {\mathit{\boldsymbol{X}}_k}} \right){{\left( {{{\mathit{\boldsymbol{\hat X}}}_k} - {\mathit{\boldsymbol{X}}_k}} \right)}^{\rm{T}}}} \right\} \ge \mathit{\boldsymbol{J}}_k^{ - 1} = {\mathit{\boldsymbol{C}}_k}$ （3）

 $\left\{ \begin{array}{l} {\mathit{\boldsymbol{J}}_{k + 1}} = \mathit{\boldsymbol{D}}_k^{22} - \mathit{\boldsymbol{D}}_k^{21}{\left( {{\mathit{\boldsymbol{J}}_k} + \mathit{\boldsymbol{D}}_k^{11}} \right)^{ - 1}}\mathit{\boldsymbol{D}}_k^{12}\\ \mathit{\boldsymbol{D}}_k^{11} = \mathit{\boldsymbol{F}}_k^{\rm{T}}\mathit{\boldsymbol{Q}}_k^{ - 1}{\mathit{\boldsymbol{F}}_k}\\ \mathit{\boldsymbol{D}}_k^{12} = - \mathit{\boldsymbol{F}}_k^{\rm{T}}\mathit{\boldsymbol{Q}}_k^{ - 1} = {\left( {\mathit{\boldsymbol{D}}_k^{21}} \right)^{\rm{T}}}\\ \mathit{\boldsymbol{D}}_k^{22} = \mathit{\boldsymbol{Q}}_k^{ - 1} + \mathit{\boldsymbol{G}}_{k + 1}^{\rm{T}}\mathit{\boldsymbol{R}}_{k + 1}^{ - 1}{\mathit{\boldsymbol{G}}_{k + 1}} \end{array} \right.$ （4）
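
As a hedged illustration of the information recursion around Eq. (4), the sketch below performs one step of the standard posterior Cramér-Rao recursion $J_{k+1} = D_k^{22} - D_k^{21}(J_k + D_k^{11})^{-1}D_k^{12}$ for a linear-Gaussian motion model. The function name `pcrlb_step`, the 1-D constant-velocity model, and all numerical values are assumptions made for illustration; `G` is taken to be the Jacobian of the measurement function $H$.

```python
import numpy as np

def pcrlb_step(J, F, Q, G, R):
    """One step of the Bayesian information recursion; returns J_{k+1}."""
    Q_inv = np.linalg.inv(Q)
    D11 = F.T @ Q_inv @ F
    D12 = -F.T @ Q_inv
    D21 = D12.T
    D22 = Q_inv + G.T @ np.linalg.inv(R) @ G
    # solve() avoids forming the explicit inverse of (J + D11)
    return D22 - D21 @ np.linalg.solve(J + D11, D12)

# Illustrative 1-D constant-velocity example (all values are assumptions).
T = 1.0
F = np.array([[1.0, T], [0.0, 1.0]])
Q = 0.1 * np.array([[T**3 / 3, T**2 / 2], [T**2 / 2, T]])
G = np.array([[1.0, 0.0]])                   # position-only measurement Jacobian
R = np.array([[4.0]])
J0 = np.linalg.inv(np.diag([100.0, 25.0]))   # inverse of the prior covariance

J1 = pcrlb_step(J0, F, Q, G, R)
C1 = np.linalg.inv(J1)   # lower bound C_{k+1} on the error covariance, Eq. (3)
```

For a linear measurement model the bound coincides with the Kalman filter covariance after one predict and update cycle, which gives a simple way to check the recursion.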

 ${E_k} = {d_k}/{T_k}$ （5）

 $\left\{ \begin{array}{l} {B_k} = \sqrt {Tr\left( {\mathit{\boldsymbol{ \boldsymbol{\varLambda} }}{\mathit{\boldsymbol{C}}_k}{\mathit{\boldsymbol{ \boldsymbol{\varLambda} }}^{\rm{T}}}} \right)} \\ \mathit{\boldsymbol{ \boldsymbol{\varLambda} }} = {\rm{blkdiag}}\left( {{\mathit{\boldsymbol{I}}_{\rm{m}}} \otimes \left[ {\begin{array}{*{20}{c}} 1&0\\ 0&{{T_k}} \end{array}} \right]} \right)\\ \Delta {A_k} = \left( {{B_{k + 1}} - {B_k}} \right)/{B_k} \end{array} \right.$ （6）

 ${\eta _k} = \Delta {A_k}/{E_k}$ （7）
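
Equations (5) to (7) can be sketched numerically as follows. This is a minimal sketch under stated assumptions: $d_k$ is read as the dwell time and $T_k$ as the sampling interval, the state is ordered as position then velocity for each of `m` spatial dimensions, and the function names `scaled_bound` and `cost_effectiveness` are hypothetical.

```python
import numpy as np

def scaled_bound(C, T, m=2):
    """B_k of Eq. (6): sqrt(Tr(Lambda C Lambda^T)) with Lambda = I_m kron [[1,0],[0,T]]."""
    Lam = np.kron(np.eye(m), np.array([[1.0, 0.0], [0.0, T]]))
    return float(np.sqrt(np.trace(Lam @ C @ Lam.T)))

def cost_effectiveness(C_k, C_k1, dwell, T, m=2):
    """eta_k of Eq. (7): relative change of the bound divided by the resource cost."""
    E = dwell / T                         # Eq. (5), E_k = d_k / T_k
    B_k = scaled_bound(C_k, T, m)
    B_k1 = scaled_bound(C_k1, T, m)
    delta_A = (B_k1 - B_k) / B_k          # Eq. (6), last line
    return delta_A / E

# Illustrative covariances (assumed values, not taken from the paper).
C_k = np.diag([9.0, 0.0, 16.0, 0.0])
C_k1 = np.diag([36.0, 0.0, 64.0, 0.0])
eta = cost_effectiveness(C_k, C_k1, dwell=0.02, T=1.0)
```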

 $\begin{array}{l} r\left( {{\mathit{\boldsymbol{X}}_k},{\mathit{\boldsymbol{a}}_k}} \right) = \\ \;\;\;\;\;\;\left\{ {\begin{array}{*{20}{l}} {\eta \left( {{\mathit{\boldsymbol{X}}_k},{\mathit{\boldsymbol{a}}_k}} \right)F\left[ {\mathit{\boldsymbol{P}}\left( {{a_{k{\rm{p}}}}} \right)} \right]}&{F\left[ {\mathit{\boldsymbol{P}}\left( {{a_{k{\rm{p}}}}} \right)} \right] < {P_{{\rm{thr}}}}}\\ {{\kappa _k}F\left[ {\mathit{\boldsymbol{P}}\left( {{a_{k{\rm{p}}}}} \right)} \right]}&{F\left[ {\mathit{\boldsymbol{P}}\left( {{a_{k{\rm{p}}}}} \right)} \right] \ge {P_{{\rm{thr}}}}} \end{array}} \right. \end{array}$ （8）

 $R\left( {{\mathit{\boldsymbol{X}}_k},{\mathit{\boldsymbol{a}}_k},{\mathit{\boldsymbol{a}}_{k + 1}}, \cdots ,{\mathit{\boldsymbol{a}}_{k + n}},n} \right) = \sum\limits_{\tau = 0}^{n - 1} {{\alpha _\tau }} r\left( {{\mathit{\boldsymbol{X}}_k},{\mathit{\boldsymbol{a}}_{k + \tau }}} \right)$ （9）
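
The piecewise reward of Eq. (8) and the weighted sum of Eq. (9) can be sketched as below. The function names are hypothetical, and the geometric weights in the example are an assumption: the text only specifies a weight $\alpha_\tau$ for each step.

```python
def step_reward(eta, f_norm, kappa, p_thr):
    """Single-step reward of Eq. (8).

    eta    : cost-effectiveness ratio of the action
    f_norm : F[P(a_kp)], the error norm produced by the action
    kappa  : coefficient used once the error norm reaches the threshold
    p_thr  : accuracy threshold P_thr
    """
    if f_norm < p_thr:
        return eta * f_norm
    return kappa * f_norm

def long_term_return(rewards, alphas):
    """Long-term return of Eq. (9): sum over tau of alpha_tau * r_tau."""
    if len(rewards) != len(alphas):
        raise ValueError("rewards and alphas must have the same length")
    return sum(a * r for a, r in zip(alphas, rewards))

# Example with n = 3 steps and assumed geometric weights alpha_tau = 0.5 ** tau.
rewards = [1.0, 2.0, 4.0]
alphas = [0.5 ** tau for tau in range(3)]
total = long_term_return(rewards, alphas)
```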

2 Problem model formulation

2.1 Target tracking filtering algorithm

 $\left\{ \begin{array}{l} {{\mathit{\boldsymbol{\hat X}}}_{k|k - 1}} = {\mathit{\boldsymbol{F}}_{k - 1}}{{\mathit{\boldsymbol{\hat X}}}_{k - 1|k - 1}}\\ {\mathit{\boldsymbol{P}}_{k|k - 1}} = {\mathit{\boldsymbol{F}}_{k - 1}}{\mathit{\boldsymbol{P}}_{k - 1|k - 1}}\mathit{\boldsymbol{F}}_{k - 1}^{\rm{T}} + {\mathit{\boldsymbol{Q}}_{k - 1}} \end{array} \right.$ （10）

 $\begin{array}{l} \mathit{\boldsymbol{X}}_{k|k - 1}^{(i)} = \left\{ {\begin{array}{*{20}{l}} {{{\mathit{\boldsymbol{\hat X}}}_{k|k - 1}}}&{i = 0}\\ {{{\mathit{\boldsymbol{\hat X}}}_{k|k - 1}} + {\left( {\sqrt {(L + \lambda ){\mathit{\boldsymbol{P}}_{k|k - 1}}} } \right)_i}}&{i = 1,2, \cdots ,L}\\ {{{\mathit{\boldsymbol{\hat X}}}_{k|k - 1}} - {\left( {\sqrt {(L + \lambda ){\mathit{\boldsymbol{P}}_{k|k - 1}}} } \right)_{i - L}}}&{i = L + 1,L + 2, \cdots ,2L} \end{array}} \right.\\ \left\{ {\begin{array}{*{20}{l}} {W_{\rm{m}}^{(0)} = \lambda /(L + \lambda )}\\ {W_{\rm{c}}^{(0)} = \lambda /(L + \lambda ) + \left( {1 - {\alpha ^2} + \beta } \right)}\\ {W_{\rm{m}}^{(i)} = W_{\rm{c}}^{(i)} = 1/[2(L + \lambda )]\;\;\;\;i = 1,2, \cdots ,2L} \end{array}} \right. \end{array}$ （11）
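
The sigma points and weights of Eq. (11) can be generated as in the following sketch. Two assumptions are made that the text does not state: the scaling parameter is computed with the common choice $\lambda = \alpha^2(L + \kappa) - L$, and the matrix square root is taken as the lower Cholesky factor, whose columns give the offsets. The function name and default parameter values are hypothetical.

```python
import numpy as np

def sigma_points(x, P, alpha=0.5, beta=2.0, kappa=0.0):
    """Return (points, Wm, Wc) for the unscented transform of Eq. (11)."""
    L = x.size
    lam = alpha**2 * (L + kappa) - L          # assumed scaling rule
    S = np.linalg.cholesky((L + lam) * P)     # S @ S.T == (L + lam) * P
    # Row 0 is the mean; rows 1..L add column i of S; rows L+1..2L subtract it.
    pts = np.vstack([x, x + S.T, x - S.T])
    Wm = np.full(2 * L + 1, 1.0 / (2.0 * (L + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (L + lam)
    Wc[0] = lam / (L + lam) + (1.0 - alpha**2 + beta)
    return pts, Wm, Wc

# Illustrative predicted state [x, vx, y, vy] and covariance (assumed values).
x_pred = np.array([1000.0, 10.0, 2000.0, -5.0])
P_pred = np.diag([100.0, 4.0, 100.0, 4.0])
pts, Wm, Wc = sigma_points(x_pred, P_pred)
```

The weighted mean of the points reproduces the predicted state, and the weighted outer products of the deviations reproduce the predicted covariance.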

 $\left\{ \begin{array}{l} {{\mathit{\boldsymbol{\hat Z}}}_{k|k - 1}} = \sum\limits_{i = 0}^{2L} {W_{\rm{m}}^{(i)}} H\left( {\mathit{\boldsymbol{X}}_{k|k - 1}^{(i)}} \right)\\ {\mathit{\boldsymbol{P}}_{{\mathit{\boldsymbol{Z}}_k}{\mathit{\boldsymbol{Z}}_k}}} = \sum\limits_{i = 0}^{2L} {\left[ {W_{\rm{c}}^{(i)}\left( {H\left( {\mathit{\boldsymbol{X}}_{k|k - 1}^{(i)}} \right) - {{\mathit{\boldsymbol{\hat Z}}}_{k|k - 1}}} \right) \cdot } \right.} \\ \;\;\;\;\;\;\;\;\left. {{{\left( {H\left( {\mathit{\boldsymbol{X}}_{k|k - 1}^{(i)}} \right) - {{\mathit{\boldsymbol{\hat Z}}}_{k|k - 1}}} \right)}^{\rm{T}}}} \right] + {\mathit{\boldsymbol{R}}_k}\\ {\mathit{\boldsymbol{P}}_{{\mathit{\boldsymbol{X}}_k}{\mathit{\boldsymbol{Z}}_k}}} = \sum\limits_{i = 0}^{2L} {\left[ {W_{\rm{c}}^{(i)}\left( {\mathit{\boldsymbol{X}}_{k|k - 1}^{(i)} - {{\mathit{\boldsymbol{\hat X}}}_{k|k - 1}}} \right) \cdot } \right.} \\ \;\;\;\;\;\;\;\;\left. {{{\left( {H\left( {\mathit{\boldsymbol{X}}_{k|k - 1}^{(i)}} \right) - {{\mathit{\boldsymbol{\hat Z}}}_{k|k - 1}}} \right)}^{\rm{T}}}} \right] \end{array} \right.$ （12）

 $\left\{ \begin{array}{l} {\mathit{\boldsymbol{K}}_k} = {\mathit{\boldsymbol{P}}_{{\mathit{\boldsymbol{X}}_k}{\mathit{\boldsymbol{Z}}_k}}}\mathit{\boldsymbol{P}}_{{\mathit{\boldsymbol{Z}}_k}{\mathit{\boldsymbol{Z}}_k}}^{ - 1}\\ {{\mathit{\boldsymbol{\hat X}}}_{k|k}} = {{\mathit{\boldsymbol{\hat X}}}_{k|k - 1}} + {\mathit{\boldsymbol{K}}_k}\left( {{\mathit{\boldsymbol{Z}}_k} - {{\mathit{\boldsymbol{\hat Z}}}_{k|k - 1}}} \right)\\ {\mathit{\boldsymbol{P}}_{k|k}} = {\mathit{\boldsymbol{P}}_{k|k - 1}} - {\mathit{\boldsymbol{K}}_k}{\mathit{\boldsymbol{P}}_{{\mathit{\boldsymbol{Z}}_k}{\mathit{\boldsymbol{Z}}_k}}}\mathit{\boldsymbol{K}}_k^{\rm{T}} \end{array} \right.$ （13）
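
The measurement update of Eqs. (12) and (13) can be sketched as below, taking the sigma points and weights as inputs. The function name `ukf_update` and its argument layout (one sigma point per row) are assumptions for illustration; the measurement function `h` is left generic because the text does not specify $H(\cdot)$ here.

```python
import numpy as np

def ukf_update(x_pred, P_pred, pts, Wm, Wc, h, R, z):
    """UKF measurement update.

    pts    : array of shape (2L+1, L), one sigma point per row
    Wm, Wc : mean and covariance weights of length 2L+1
    h      : measurement function mapping a state vector to a measurement vector
    """
    Z = np.array([h(p) for p in pts])        # propagated sigma points
    z_pred = Wm @ Z                          # predicted measurement, Eq. (12)
    dZ = Z - z_pred
    dX = pts - x_pred
    P_zz = dZ.T @ (Wc[:, None] * dZ) + R     # innovation covariance
    P_xz = dX.T @ (Wc[:, None] * dZ)         # state-measurement cross covariance
    K = P_xz @ np.linalg.inv(P_zz)           # gain, Eq. (13)
    x_upd = x_pred + K @ (z - z_pred)
    P_upd = P_pred - K @ P_zz @ K.T
    return x_upd, P_upd
```

For a linear measurement function the result coincides with the ordinary Kalman filter update, which is a convenient sanity check before switching to a nonlinear `h`.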
2.2 Long-term scheduling strategy for phased array radar

 $\begin{array}{l} A_{k + n}^{{\rm{opt}}} = \arg \mathop {\max }\limits_{{\mathit{\boldsymbol{A}}_{k + n}}} \left[ {R\left( {{\mathit{\boldsymbol{X}}_k},{\mathit{\boldsymbol{a}}_k}, \cdots ,{\mathit{\boldsymbol{a}}_{k + n}},n} \right)} \right]\\ {\rm{s}}.\;{\rm{t}}.\;\left\{ {\begin{array}{*{20}{l}} {P_{\rm{d}}^{k + i} \ge {P_{{\rm{dmin }}}}}\\ {F\left( {\mathit{\boldsymbol{P}}_{\rm{e}}^{k + i}} \right) \le {P_{{\rm{thr }}}}} \end{array}} \right.i = 1,2, \cdots ,n \end{array}$ （14）

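2.3 Strategy implementation procedure

Fig. 2 Optimal control flow of long-term scheduling strategy

1) Computing the cost-effectiveness ratio of beam scheduling under multi-step prediction. From Eq. (14), the detection probability $P_{\rm{d}}^{k + i}$ and the error covariance matrix $\boldsymbol{P}_{\rm{e}}^{k + i}$ must be predicted $i$ steps ahead at time $t_k$, where $i = 1, 2, \cdots, n$, in order to compute the optimal cost-effectiveness ratio of each target and the corresponding tracking waveform parameters. Because the target range changes very little within the radar prediction time, $r_k$ can be used to approximate $r_{\rm{p}}^{k + i}$. Given the dwell-time sequence $\boldsymbol{\tau}_{\rm{p}}^{k} = [\tau_{\rm{p}}^{k + 1}, \tau_{\rm{p}}^{k + 2}, \cdots, \tau_{\rm{p}}^{k + n}]$, the echo signal-to-noise ratio ${\rm{SNR}}_{\rm{p}}^{k + i}$ is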

 $\left\{ \begin{array}{l} {\rm{SNR}}_{\rm{p}}^{k + i} = {\rm{SN}}{{\rm{R}}_{{\rm{ref}}}}\left( {\tau _{\rm{p}}^{k + i}/{\tau _{{\rm{ref}}}}} \right){\left( {r_{\rm{p}}^{k + i}/{r_{{\rm{ref}}}}} \right)^{ - 4}}\\ r_{\rm{p}}^{k + i} \approx {r_k} = \sqrt {{{\left( {{x_k} - {x_{\rm{p}}}} \right)}^2} + {{\left( {{y_k} - {y_{\rm{p}}}} \right)}^2}} \end{array} \right.$ （15）

 $\begin{array}{l} P_{\rm{d}}^{k + i} = \\ \;\;\;\;\;\;\exp \left( {\frac{{2\ln {P_{\rm{f}}}}}{{2 + {\rm{SNR}}_{\rm{p}}^{k + i}}}} \right)\left[ {1 - \frac{{2{\rm{SNR}}_{\rm{p}}^{k + i}\ln {P_{\rm{f}}}}}{{{{\left( {2 + {\rm{SNR}}_{\rm{p}}^{k + i}} \right)}^2}}}} \right] \end{array}$ （16）
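
Equations (15) and (16) translate directly into code. In this minimal sketch the function names are hypothetical, the SNR is assumed to be a linear power ratio (not dB), and the reference values in the example are illustrative only.

```python
import math

def predicted_snr(snr_ref, tau, tau_ref, r, r_ref):
    """Eq. (15): SNR scales linearly with dwell time and with range to the power -4."""
    return snr_ref * (tau / tau_ref) * (r / r_ref) ** -4

def detection_probability(snr, p_fa):
    """Eq. (16): detection probability for a given SNR and false-alarm probability."""
    ln_pf = math.log(p_fa)
    return math.exp(2.0 * ln_pf / (2.0 + snr)) * (
        1.0 - 2.0 * snr * ln_pf / (2.0 + snr) ** 2
    )

# Illustrative use: the target sits at the reference range, the dwell is doubled.
snr = predicted_snr(snr_ref=10.0, tau=0.02, tau_ref=0.01, r=100e3, r_ref=100e3)
p_d = detection_probability(snr, p_fa=1e-6)
```

Two limiting cases help when checking an implementation: at zero SNR the expression reduces to the false-alarm probability, and doubling the range divides the SNR by 16.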

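2) Computing the reward function value under multi-step prediction. The position-component estimation errors are extracted from the $i$-step predicted error covariance matrix $\boldsymbol{P}_{\rm{e}}^{k + i}$ of the target at time $t_k$, and the corresponding F-norm is computed, i.e.,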

 $F\left( {\mathit{\boldsymbol{P}}_{\rm{e}}^{k + i}} \right) = {\left\| {\mathit{\boldsymbol{P}}_{\rm{e}}^{k + i}} \right\|_{\rm{F}}} = \sqrt {\mathit{\boldsymbol{P}}_{\rm{e}}^{k + i}\left( {1,1} \right) + \mathit{\boldsymbol{P}}_{\rm{e}}^{k + i}\left( {3,3} \right)}$ （17）
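
A short sketch of Eq. (17) and of the per-step constraints in Eq. (14) follows. It assumes the state ordering $[x, v_x, y, v_y]$, so that the one-based entries (1,1) and (3,3) are the position variances; the function names are hypothetical.

```python
import numpy as np

def position_error_norm(P_e):
    """F(P_e) of Eq. (17): square root of the summed position variances."""
    # One-based entries (1,1) and (3,3) are zero-based [0, 0] and [2, 2].
    return float(np.sqrt(P_e[0, 0] + P_e[2, 2]))

def is_feasible(p_d, P_e, p_d_min, p_thr):
    """Constraints of Eq. (14) for one predicted step."""
    return p_d >= p_d_min and position_error_norm(P_e) <= p_thr

P_e = np.diag([9.0, 1.0, 16.0, 1.0])   # illustrative covariance (assumed values)
f_val = position_error_norm(P_e)
```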

2.4 Hybrid optimization algorithm

 $\overset{\frown}{R}\left( y \right) = R\left( y \right) - {c_{\min }}$ （18）

 $\left\{ \begin{array}{l} v_w^{ij}\left( {t + 1} \right) = \omega v_w^{ij}\left( t \right) + {c_1}{r_1}\left[ {p_{{\rm{best}}}^{ij}\left( t \right) - x_w^{ij}\left( t \right)} \right] + \\ \;\;\;\;\;\;\;\;{c_2}{r_2}\left[ {g_{{\rm{best}}}^j\left( t \right) - x_w^{ij}\left( t \right)} \right]\\ x_w^{ij}\left( {t + 1} \right) = x_w^{ij}\left( t \right) + v_w^{ij}\left( {t + 1} \right) \end{array} \right.$ （19）
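
The fitness shift of Eq. (18) and the particle update of Eq. (19) can be sketched as follows. Several details are assumptions rather than the paper's settings: $c_{\min}$ is treated as a user-supplied lower bound on the return, $r_1$ and $r_2$ are drawn uniformly from $[0, 1)$ for every element, and the default values of the inertia weight and learning factors are illustrative.

```python
import numpy as np

def shifted_fitness(returns, c_min):
    """Eq. (18): subtract the bound c_min from each return value."""
    return np.asarray(returns, dtype=float) - c_min

def pso_step(x, v, p_best, g_best, rng, w=0.7, c1=1.5, c2=1.5):
    """Eq. (19): one velocity and position update for all particles.

    x, v, p_best : arrays of shape (n_particles, n_dims)
    g_best       : array broadcastable to x (best position found by the swarm)
    """
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v_new = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    return x + v_new, v_new

rng = np.random.default_rng(0)
x = np.array([[1.0, 2.0], [3.0, 4.0]])
v = np.array([[0.5, -0.5], [1.0, 0.0]])
x_next, v_next = pso_step(x, v, p_best=x.copy(), g_best=x[0], rng=rng)
```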

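Fig. 3 Flow chart of parallel hybrid GAPSO algorithm implementation

1) Selection operator. Selection, also called reproduction, is the process of choosing the fitter individuals in the population to form a new population. With proportional selection, the probability $P_i$ that individual $i$ is selected is

 ${P_i} = \overset{\frown}{R}\left( {\mathit{\boldsymbol{Y}}_{\rm{w}}^i} \right)/\sum\limits_{j = 1}^N {\overset{\frown}{R}\left( {\mathit{\boldsymbol{Y}}_{\rm{w}}^j} \right)}$ （20）

2) Crossover operator. Crossover is the process of mating and recombination between individuals. For real-number encoding, arithmetic crossover is commonly used. Assuming crossover between two individuals $Y_{\rm{w}}^{\rm{A}}(t)$ and $Y_{\rm{w}}^{\rm{B}}(t)$, the two new individuals produced are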

 $\left\{ {\begin{array}{*{20}{l}} {Y_{\rm{w}}^{\rm{A}}\left( {t + 1} \right) = \alpha Y_{\rm{w}}^{\rm{B}}\left( t \right) + \left( {1 - \alpha } \right)Y_{\rm{w}}^{\rm{A}}\left( t \right)}\\ {Y_{\rm{w}}^{\rm{B}}\left( {t + 1} \right) = \alpha Y_{\rm{w}}^{\rm{A}}\left( t \right) + \left( {1 - \alpha } \right)Y_{\rm{w}}^{\rm{B}}\left( t \right)} \end{array}} \right.$ （21）

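3) Mutation operator. Mutation is the process of replacing alleles on an individual's chromosome. To increase population diversity, this paper uses uniform mutation, which operates as follows: each gene locus of an individual is designated in turn as a mutation point, and at each mutation point the original gene value is replaced, with mutation probability $P_{\rm{m}}$, by a random number drawn from the value range of the corresponding gene.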
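
The three genetic operators described above can be sketched for real-coded individuals as follows. The function names and the use of NumPy's random generator are assumptions for illustration; the fitness values passed to `select` are assumed to be non-negative, which is the purpose of the shift in Eq. (18).

```python
import numpy as np

def select(pop, fitness, rng):
    """Proportional (roulette-wheel) selection, Eq. (20)."""
    p = fitness / fitness.sum()
    idx = rng.choice(len(pop), size=len(pop), p=p)
    return pop[idx]

def arithmetic_crossover(y_a, y_b, alpha):
    """Arithmetic crossover of two individuals, Eq. (21)."""
    new_a = alpha * y_b + (1.0 - alpha) * y_a
    new_b = alpha * y_a + (1.0 - alpha) * y_b
    return new_a, new_b

def uniform_mutation(y, low, high, p_m, rng):
    """Replace each gene, with probability p_m, by a uniform draw from its range."""
    mask = rng.random(y.shape) < p_m
    return np.where(mask, rng.uniform(low, high, size=y.shape), y)

rng = np.random.default_rng(1)
pop = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
parents = select(pop, np.array([1.0, 2.0, 3.0]), rng)
child_a, child_b = arithmetic_crossover(parents[0], parents[1], alpha=0.25)
mutant = uniform_mutation(child_a, low=0.0, high=5.0, p_m=0.1, rng=rng)
```

Arithmetic crossover keeps the sum of the two individuals unchanged, so the offspring always lie on the segment between the parents when $0 \le \alpha \le 1$.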

3 Simulation experiments

3.1 Comparison methods

3.2 Simulation scenario

Fig. 4 Tracking target motion trajectory in area of responsibility
3.3 Simulation results

Fig. 5 RMSE of each target under different methods
Fig. 6 Average consumption of resources in scheduling process under different methods

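| Method | Mean square error (m) | Dwell time (s) | Sampling interval (s) | Decision duration (s) |
|---|---|---|---|---|
| Prediction step size n=2 | 327.82 | 0.02293 | 1.1538 | 5.6698 |
| Prediction step size n=3 | 319.17 | 0.03008 | 1.1935 | 7.4832 |
| Prediction step size n=4 | 305.80 | 0.03355 | 1.2172 | 9.5013 |
| Prediction step size n=5 | 292.60 | 0.03496 | 1.2280 | 12.507 |
| Prediction step size n=6 | 306.74 | 0.03534 | 1.2265 | 15.445 |
| Method 1 | 329.41 | 0.01542 | 1.1072 | 3.9864 |
| Method 2 | 335.08 | 0.02044 | 1.4823 | 1.1809 |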
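
The 11.17% figure quoted in the abstract can be reproduced from the tabulated errors, assuming that Method 1 (方法1) is the traditional "short-term" strategy used as the baseline. That mapping is an inference from the numbers, not something stated in the table itself.

```python
# Error values (m) transcribed from the results table.
rmse_m = {
    "n=2": 327.82, "n=3": 319.17, "n=4": 305.80, "n=5": 292.60, "n=6": 306.74,
    "Method 1": 329.41, "Method 2": 335.08,
}

def improvement_pct(baseline, value):
    """Relative reduction of the error with respect to the baseline, in percent."""
    return 100.0 * (baseline - value) / baseline

long_term = {k: v for k, v in rmse_m.items() if k.startswith("n=")}
best_step = min(long_term, key=long_term.get)          # step size with lowest error
gain_vs_method1 = improvement_pct(rmse_m["Method 1"], rmse_m[best_step])
print(best_step, round(gain_vs_method1, 2))            # prints: n=5 11.17
```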

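Fig. 7 RMSE and decision duration as a function of prediction step size

Fig. 8 Changes in RMSE of target 1 during the scheduling process under three methods

Fig. 9 Beam illumination during the scheduling process with prediction step size n=5
4 Conclusions

1) The most suitable target can be selected for scheduling at each decision time. While tracking accuracy is maintained, the tracking dwell time and the sampling interval are increased appropriately, which improves the utilization of time resources.

2) During scheduling, targets whose tracking accuracy exceeds the threshold are scheduled in time, which effectively increases the tracking target capacity and reduces the track-loss rate.

3) An optimal step size exists. The improvement in scheduling performance comes at the cost of real-time decision making, so in practice the decision maker must trade off performance against timeliness when choosing the prediction step size.

4) A sound theoretical framework is provided for the joint beam and waveform scheduling problem. It extends well and can be applied to multi-objective decision problems.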

http://dx.doi.org/10.7527/S1000-6893.2019.23519

#### Article information

LIU Yiming, SHENG Wen, HU Bing, ZHANG Lei

Long time tracking beam scheduling and waveform optimization strategy for phased array radar

Acta Aeronautica et Astronautica Sinica, 2020, 41(3): 323519.
http://dx.doi.org/10.7527/S1000-6893.2019.23519