Documentation |
On this page… |
---|
Types of Recursive Estimation Algorithms |
You can choose from the following four recursive estimation algorithms:
You specify the type of recursive estimation algorithms as arguments in the recursive estimation commands.
For detailed information about these algorithms, see the corresponding chapter in System Identification: Theory for the User by Lennart Ljung (Prentice Hall PTR, Upper Saddle River, NJ, 1999).
The general recursive identification algorithm is given by the following equation:
$$\widehat{\theta}\left(t\right)=\widehat{\theta}\left(t-1\right)+K\left(t\right)\left(y\left(t\right)-\widehat{y}\left(t\right)\right)$$
$$\widehat{\theta}\left(t\right)$$ is the parameter estimate at time t. y(t) is the observed output at time t and $$\widehat{y}\left(t\right)$$ is the prediction of y(t) based on observations up to time t-1. The gain, K(t), determines how much the current prediction error $$y\left(t\right)-\widehat{y}\left(t\right)$$ affects the update of the parameter estimate. The estimation algorithms minimize the prediction-error term $$y\left(t\right)-\widehat{y}\left(t\right)$$.
The gain has the following general form:
$$K\left(t\right)=Q\left(t\right)\psi \left(t\right)$$
The recursive algorithms supported by the System Identification Toolbox™ product differ based on different approaches for choosing the form of Q(t) and computing $$\psi \left(t\right)$$, where $$\psi \left(t\right)$$ represents the gradient of the predicted model output $$\widehat{y}\left(t|\theta \right)$$ with respect to the parameters $$\theta $$.
The simplest way to visualize the role of the gradient $$\psi \left(t\right)$$ of the parameters, is to consider models with a linear-regression form:
$$y\left(t\right)={\psi}^{T}\left(t\right){\theta}_{0}\left(t\right)+e\left(t\right)$$
In this equation, $$\psi \left(t\right)$$ is the regression vector that is computed based on previous values of measured inputs and outputs. $${\theta}_{0}\left(t\right)$$ represents the true parameters. e(t) is the noise source (innovations), which is assumed to be white noise. The specific form of $$\psi \left(t\right)$$ depends on the structure of the polynomial model.
For linear regression equations, the predicted output is given by the following equation:
$$\widehat{y}\left(t\right)={\psi}^{T}\left(t\right)\widehat{\theta}\left(t-1\right)$$
For models that do not have the linear regression form, it is not possible to compute exactly the predicted output and the gradient $$\psi \left(t\right)$$ for the current parameter estimate $$\widehat{\theta}\left(t-1\right)$$. To learn how you can compute approximation for $$\psi \left(t\right)$$ and $$\widehat{\theta}\left(t-1\right)$$ for general model structures, see the section on recursive prediction-error methods in System Identification: Theory for the User by Lennart Ljung (Prentice Hall PTR, Upper Saddle River, NJ, 1999).
The following set of equations summarizes the Kalman filter adaptation algorithm:
$$\widehat{\theta}\left(t\right)=\widehat{\theta}\left(t-1\right)+K\left(t\right)\left(y\left(t\right)-\widehat{y}\left(t\right)\right)$$
$$\widehat{y}\left(t\right)={\psi}^{T}\left(t\right)\widehat{\theta}\left(t-1\right)$$
$$K\left(t\right)=Q\left(t\right)\psi \left(t\right)$$
$$Q\left(t\right)=\frac{P\left(t-1\right)\psi \left(t\right)}{{R}_{2}+{\psi}^{T}\left(t\right)P\left(t-1\right)\psi \left(t\right)}$$
$$P\left(t\right)=P\left(t-1\right)+{R}_{1}-\frac{P\left(t-1\right)\psi \left(t\right)\psi {\left(t\right)}^{T}P\left(t-1\right)}{{R}_{2}+\psi {\left(t\right)}^{T}P\left(t-1\right)\psi \left(t\right)}$$
This formulation assumes the linear-regression form of the model:
$$y\left(t\right)={\psi}^{T}\left(t\right){\theta}_{0}\left(t\right)+e\left(t\right)$$
The Kalman filter is used to obtain Q(t).
This formulation also assumes that the true parameters $${\theta}_{0}\left(t\right)$$ are described by a random walk:
$${\theta}_{0}\left(t\right)={\theta}_{0}\left(t-1\right)+w\left(t\right)$$
w(t) is Gaussian white noise with the following covariance matrix, or drift matrix R_{1}:
$$Ew\left(t\right){w}^{T}\left(t\right)={R}_{1}$$
R_{2} is the variance of the innovations e(t) in the following equation:
$$y\left(t\right)={\psi}^{T}\left(t\right){\theta}_{0}\left(t\right)+e\left(t\right)$$
The Kalman filter algorithm is entirely specified by the sequence of data y(t), the gradient $$\psi \left(t\right)$$, R_{1}, R_{2}, and the initial conditions $$\theta \left(t=0\right)$$ (initial guess of the parameters) and $$P\left(t=0\right)$$ (covariance matrix that indicates parameters errors).
Note: To simplify the inputs, you can scale R_{1}, R_{2}, and $$P\left(t=0\right)$$ of the original problem by the same value such that R_{2} is equal to 1. This scaling does not affect the parameters estimates. |
The general syntax for the command described in Algorithms for Recursive Estimation is the following:
[params,y_hat]=command(data,nn,adm,adg)
To specify the Kalman filter algorithm, set adm to 'kf' and adg to the value of the drift matrix R_{1} (described in Mathematics of the Kalman Filter Algorithm).
The following set of equations summarizes the forgetting factor adaptation algorithm:
$$\widehat{\theta}\left(t\right)=\widehat{\theta}\left(t-1\right)+K\left(t\right)\left(y\left(t\right)-\widehat{y}\left(t\right)\right)$$
$$\widehat{y}\left(t\right)={\psi}^{T}\left(t\right)\widehat{\theta}\left(t-1\right)$$
$$K\left(t\right)=Q\left(t\right)\psi \left(t\right)$$
$$Q\left(t\right)=\frac{P\left(t-1\right)}{\lambda +{\psi}^{T}\left(t\right)P\left(t-1\right)\psi \left(t\right)}$$
$$P\left(t\right)={\scriptscriptstyle \frac{1}{\lambda}}\left(P\left(t-1\right)-\frac{P\left(t-1\right)\psi \left(t\right)\psi {\left(t\right)}^{T}P\left(t-1\right)}{\lambda +\psi {\left(t\right)}^{T}P\left(t-1\right)\psi \left(t\right)}\right)$$
To obtain Q(t), the following function is minimized at time t:
$$\sum}_{k=1}^{t}{\lambda}^{t-k}{(y(k)-\widehat{y}(k))}^{2$$
This approach discounts old measurements exponentially such that an observation that is $$\tau $$ samples old carries a weight that is equal to $${\lambda}^{\tau}$$ times the weight of the most recent observation. $$\tau ={\scriptscriptstyle \raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$1-\lambda $}\right.}$$ represents the memory horizon of this algorithm. Measurements older than $$\tau ={\scriptscriptstyle \raisebox{1ex}{$1$}\!\left/ \!\raisebox{-1ex}{$1-\lambda $}\right.}$$ typically carry a weight that is less than about 0.3.
$$\lambda $$ is called the forgetting factor and typically has a positive value between 0.97 and 0.995.
Note: In the linear regression case, the forgetting factor algorithm is known as the recursive least-squares (RLS) algorithm. The forgetting factor algorithm for $$\lambda $$ = 1 is equivalent to the Kalman filter algorithm with R_{1}=0 and R_{2}=1. For more information about the Kalman filter algorithm, see Kalman Filter Algorithm. |
The general syntax for the command described in Algorithms for Recursive Estimation is the following:
[params,y_hat]=command(data,nn,adm,adg)
To specify the forgetting factor algorithm, set adm to 'ff' and adg to the value of the forgetting factor $$\lambda $$ (described in Mathematics of the Forgetting Factor Algorithm).
In the linear regression case, the gradient methods are also known as the least mean squares (LMS) methods.
The following set of equations summarizes the unnormalized gradient and normalized gradient adaptation algorithm:
$$\widehat{\theta}\left(t\right)=\widehat{\theta}\left(t-1\right)+K\left(t\right)\left(y\left(t\right)-\widehat{y}\left(t\right)\right)$$
$$\widehat{y}\left(t\right)={\psi}^{T}\left(t\right)\widehat{\theta}\left(t-1\right)$$
$$K\left(t\right)=Q\left(t\right)\psi \left(t\right)$$
In the unnormalized gradient approach, Q(t) is the product of the gain $$\gamma $$ and the identity matrix:
$$Q(t)=\gamma *\psi (t)$$
In the normalized gradient approach, Q(t) is the product of the gain $$\gamma $$, and the identity matrix is normalized by the magnitude of the gradient $$\psi \left(t\right)$$:
$$Q\left(t\right)=\frac{\gamma *\psi (t)}{{\left|\psi \left(t\right)\right|}^{2}}$$
These choices of Q(t) update the parameters in the negative gradient direction, where the gradient is computed with respect to the parameters.
The general syntax for the command described in Algorithms for Recursive Estimation is the following:
[params,y_hat]=command(data,nn,adm,adg)
To specify the unnormalized gain algorithm, set adm to 'ug' and adg to the value of the gain$$\gamma $$ (described in Mathematics of the Unnormalized and Normalized Gradient Algorithm).
To specify the normalized gain algorithm, set adm to 'ng' and adg to the value of the gain$$\gamma $$.