# filter

Class: ssm

Forward recursion of state-space models

## Syntax

• X = filter(Mdl,Y) example
• X = filter(Mdl,Y,Name,Value) example
• [X,logL,Output] = filter(___) example

## Description

example

X = filter(Mdl,Y) performs forward recursion of the fully specified, linear state-space model Mdl to estimate filtered states (X). That is, filter applies the Kalman filter using the state-space model Mdl and the observed responses Y to estimate X.

example

X = filter(Mdl,Y,Name,Value) performs forward recursion of the state-space model Mdl with additional options specified by one or more Name,Value pair arguments. If Mdl is not fully specified, then you must set the unknown parameters to known scalars using the Name,Value pair argument Params.

example

[X,logL,Output] = filter(___) additionally returns the loglikelihood value (logL) and an output structure array (Output) using any of the input arguments in the previous syntaxes. Output contains:Filtered and forecasted statesEstimated covariance matrices of the filtered and forecasted statesLoglikelihood valueForecasted observations and its estimated covariance matrixAdjusted Kalman gainVector indicating which data the software used to filter

## Input Arguments

collapse all

### Mdl — State-space modelssm model

State-space model, specified as a model returned by ssm or estimate.

If Mdl is not fully specified (that is, Mdl contains unknown parameters), then specify values for the unknown parameters using the 'Params' name-value pair argument. Otherwise, the software throws an error.

Mdl does not store observed responses or predictor data. Supply the data wherever necessary using the appropriate input or name-value pair arguments.

### Y — Observed response datacell vector of numeric vectors | matrix

Observed response data to which Mdl is fit, specified as a cell vector of numeric vectors or a matrix.

• If Mdl is time invariant with respect to the observation equation, then Y is a T-by-n matrix, where each row corresponds to a period and each column corresponds to a particular observation in the model. Therefore, T is the sample size and m is the number of observations per period. The last row of Y contains the latest observations.

• If Mdl is time varying with respect to the observation equation, then Y is a T-by-1 cell vector. Each element of the cell vector corresponds to a period and contains an nt-dimensional vector of observations for that period. The corresponding dimensions of the coefficient matrices in Mdl.C{t} and Mdl.D{t} must be consistent with the matrix in Y{t} for all periods. The last cell of Y contains the latest observations.

### Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

### 'Beta' — Regression coefficients[] (default) | matrix

Regression coefficients corresponding to predictor variables, specified as the comma-separated pair consisting of 'Beta' and a matrix.

In general, each column of Beta corresponds to an observation in the state-space model, and each row corresponds to a predictor. Therefore, Beta must have as many columns as the dimension of the observation model, and rows equal to the number of predictors.

### 'Params' — Values for unknown parametersnumeric vector

Values for unknown parameters in the state-space model, specified as the column-separated pair consisting of 'Params' and a numeric vector.

The elements of Params correspond to the unknown parameters in the state-space model matrices A, B, C, and D, and, optionally, the initial state mean Mean0 and covariance matrix Cov0.

• If you created Mdl explicitly (that is, by specifying the matrices without a parameter-to-matrix mapping function), then the software maps the elements of Params to NaNs in the state-space model matrices and initial state values. The software searches for NaNs column-wise following the order A, B, C, D, Mean0, and Cov0.

• If you created Mdl implicitly (that is, by specifying the matrices with a parameter-to-matrix mapping function), then you must set initial parameter values for the state-space model matrices, initial state values, and state types within the parameter-to-matrix mapping function.

If Mdl contains unknown parameters, then you must specify their values. Otherwise, the software ignores the value of Params.

### 'Predictors' — Predictor variables in state-space model observation equation[] (default) | matrix

Predictor variables in the state-space model observation equation, specified as the comma-separated pair consisting of 'Predictors' and a matrix. The columns of Predictors correspond to individual predictor variables. Predictors must have T rows, where row t corresponds to the observed predictors at period t (Zt). The expanded observation equation is

${y}_{t}-{Z}_{t}\beta =C{x}_{t}+D{u}_{t}.$

that is, the software deflates the observations using the regression component. β is the time-invariant vector of regression coefficients that the software estimates with all other parameters.

If there are n observations per period, then the software regresses all predictor series onto each observation.

If you specify Predictors, then Mdl must be time invariant. Otherwise, the software returns an error.

By default, the software excludes a regression component from the state-space model.

Data Types: double

### 'SquareRoot' — Square root filter method flagfalse (default) | true

Square root filter method flag, specified as the comma-separated pair consisting of 'SquareRoot' and true or false.

Example: 'SquareRoot',true

Data Types: logical

### 'Tolerance' — Forecast uncertainty threshold0 (default) | nonnegative scalar

Forecast uncertainty threshold, specified as the comma-separated pair consisting of 'Tolerance' and a nonnegative scalar.

If the forecast uncertainty for a particular observation is less than Tolerance during numerical estimation, then the software removes the uncertainty corresponding to the observation from the forecast covariance matrix before its inversion.

It is best practice to set Tolerance to a small number, for example, le-15, to overcome numerical obstacles during estimation.

Example: 'Tolerance',le-15

Data Types: double

### 'Univariate' — Univariate treatment of multivariate series flagfalse (default) | true

Univariate treatment of a multivariate series flag, specified as the comma-separated pair consisting of 'Univariate' and true or false.

DtDt' must be diagonal, where Dt is one of the following:

• The matrix D{t} in a time-varying state-space model

• The matrix D in a time-invariant state-space model

Example: 'Univariate',true

Data Types: logical

## Output Arguments

collapse all

### X — Filtered statesmatrix | cell vector of vectors

Filtered states, returned as a matrix or a cell vector of matrices.

If the state-space model Mdl is time invariant, then the number of rows of X is the sample size, and the number of columns of X is the number of states. The last row of X contains the latest, filtered states.

If the state-space model Mdl is time varying, then X is a cell vector with length equal to the sample size. Cell t of X contains a vector of filtered states with length equal to the number of states in period t. The last cell of X contains the latest, filtered states.

Data Types: cell | double

### logL — Loglikelihood function valuescalar

Loglikelihood function value, returned as a scalar.

Data Types: double

### Output — Filtering results by periodstructure array

Filtering results by period, returned as a structure array.

Output is a T-by-1 structure, where element t corresponds to the filtering result at time t.

• If Univariate is false (it is by default), then the following table outlines the fields of Output.

FieldDescriptionEstimate of
LogLikelihoodScalar loglikelihood objective function valueN/A
FilteredStatesmt-by-1 vector of filtered states$E\left({x}_{t}|{y}_{1},...,{y}_{t}\right)$
FilteredStatesCovmt-by-mt variance-covariance matrix of filtered states$Var\left({x}_{t}|{y}_{1},...,{y}_{t}\right)$
ForecastedStatesmt-by-1 vector of state forecasts$E\left({x}_{t}|{y}_{1},...,{y}_{t-1}\right)$
ForecastedStatesCovmt-by-mt variance-covariance matrix of state forecasts$Var\left({x}_{t}|{y}_{1},...,{y}_{t-1}\right)$
ForecastedObsht-by-1 forecasted observation vector$E\left({y}_{t}|{y}_{1},...,{y}_{t-1}\right)$
ForecastedObsCovht-by-ht variance-covariance matrix of forecasted observations$Var\left({y}_{t}|{y}_{1},...,{t}_{t-1}\right)$
KalmanGainmt-by-nt adjusted Kalman gain matrixN/A
DataUsedht-by-1 logical vector indicating whether the software filters using a particular observation. For example, if observation i at time t is a NaN, then element i in DataUsed at time t is 0.N/A

• If Univarite is true, then the fields of Output are the same as in the previous table, except for the following amendments.

FieldChanges
ForecastedObsSame dimensions as if Univariate = 0, but only the first elements are equal
ForecastedObsCov

n-by-1 vector of forecasted observation variances.

The first element of this vector is equivalent to ForecastedObsCov(1,1) when Univariate is false. The rest of the elements are not necessarily equivalent to their corresponding values in ForecastObsCov when Univariate.

KalmanGainSame dimensions as if Univariate is false, though KalmanGain might have different entries.

Data Types: struct

## Definitions

Consider obtaining the 1-step-ahead states forecast for period t + 1 using all information up to period t. The adjusted Kalman gain (${K}_{adj,t}$) is the amount of weight put on the estimated observation innovation for period t (${\stackrel{^}{\epsilon }}_{t}$) as compared to the 2-step-ahead state forecast (${\stackrel{^}{x}}_{t+1|t-1}$).

That is,

${\stackrel{^}{x}}_{t+1|t}={A}_{t}{\stackrel{^}{x}}_{t|t}={A}_{t}{\stackrel{^}{x}}_{t|t-1}+{A}_{t}{K}_{t}{\stackrel{^}{\epsilon }}_{t}={\stackrel{^}{x}}_{t+1|t-1}+{K}_{adj,t}{\stackrel{^}{\epsilon }}_{t}.$

### Filtered States

States forecasts at period t, updated using all information (for example the observed responses) up to period t.

The mt-by-1 vector of filtered states at period t is ${x}_{t|t}=E\left({x}_{t}|{y}_{t},...,{y}_{1}\right)$. The estimated vector of filtered states is

${\stackrel{^}{x}}_{t|t}={\stackrel{^}{x}}_{t|t-1}+{K}_{t}{\stackrel{^}{\epsilon }}_{t},$

where:

• ${\stackrel{^}{x}}_{t|t-1}$ is the vector of state forecasts at period t using the observed responses from periods 1 through t – 1.

• Kt is the mt-by-ht raw Kalman gain matrix for period t.

• ${\stackrel{^}{\epsilon }}_{t}={y}_{t}-{C}_{t}{\stackrel{^}{x}}_{t|t-1}$ is the ht-by-1 vector of estimated observation innovations at period t.

In other words, the filtered states at period t are the forecasted states at period t plus an adjustment based on the trustworthiness of the observation. Trustworthy observations have very little corresponding observation innovation variance (for example, the maximum eigenvalue of DtDt is relatively small). Consequently, for a given estimated observation innovation, the term ${K}_{t}{\stackrel{^}{\epsilon }}_{t}$ has a higher impact on the values of the filtered states than untrustworthy observations.

At period t, the filtered states have variance-covariance matrix

${P}_{t|t}={P}_{t|t-1}-{K}_{t}{C}_{t}{P}_{t|t-1}^{\prime },$

where ${P}_{t|t-1}$ is the estimated variance-covariance matrix of the state forecasts at period t, given all information up to period t – 1.

### Forecasted Observations

s-step-ahead, forecasted observations are estimates of the observations at period t using all information (for example, the observed responses) up to period ts.

The nt-by-1 vector of 1-step-ahead, forecasted observations at period t is ${y}_{t|t-1}=E\left({y}_{t}|{y}_{t-1},...,{y}_{1}\right)$. The estimated vector of forecasted observations is

${\stackrel{^}{y}}_{t|t-1}={C}_{t}{\stackrel{^}{x}}_{t|t-1},$

where ${\stackrel{^}{x}}_{t|t-1}$ is the mt-by-1 estimated vector of state forecasts at period t.

At period t, the 1-step-ahead, forecasted observations have variance-covariance matrix

${V}_{t|t-1}=Var\left({y}_{t}|{y}_{t-1},...,{y}_{1}\right)={C}_{t}{P}_{t|t-1}{C}_{t}^{\prime }+{D}_{t}{D}_{t}^{\prime }.$

where ${P}_{t|t-1}$ is the estimated variance-covariance matrix of the state forecasts at period t, given all information up to period t – 1.

In general, the s-step-ahead, vector of state forecasts is ${x}_{t|t-s}=E\left({x}_{t}|{y}_{t-s},...,{y}_{1}\right)$. The s-step-ahead, forecasted observation vector is

${\stackrel{^}{y}}_{t+s|t}={C}_{t+s}{\stackrel{^}{x}}_{t+s|t}.$

### Kalman Filter

In the state-space model framework, the Kalman filter estimates the values of a latent, linear, stochastic, dynamic process based on possibly mismeasured observations. Given distribution assumptions on the uncertainty, the Kalman filter also estimates model parameters via maximum likelihood.

Starting with initial values for states (x0|0), the initial state variance-covariance matrix (P0|0), and initial values for all unknown parameters (θ0), the simple Kalman filter:

1. Estimates, for t = 1,...,T:

1. The 1-step-ahead vector of state forecasts vector for period t (${\stackrel{^}{x}}_{t|t-1}$) and its variance-covariance matrix (${P}_{t|t-1}$)

2. The 1-step-ahead vector of observation forecasts for period t (${\stackrel{^}{y}}_{t|t-1}$) and its estimated variance-covariance matrix (${V}_{t|t-1}$)

3. The filtered states for period t (${\stackrel{^}{x}}_{t|t}$) and its estimated variance-covariance matrix (${P}_{t|t}$)

2. Feeds the forecasted and filtered estimates into the data likelihood function

$\mathrm{ln}p\left({y}_{T},...,{y}_{1}\right)=\sum _{t=1}^{T}\mathrm{ln}\varphi \left({y}_{t};{\stackrel{^}{y}}_{t|t-1},{V}_{t|t-1}\right),$

where $\varphi \left({y}_{t};{\stackrel{^}{y}}_{t|t-1},{V}_{t|t-1}\right)$ is the multivariate normal probability density function with mean ${\stackrel{^}{y}}_{t|t-1}$ and variance ${V}_{t|t-1}$.

3. Feeds this procedure into an optimizer to maximize the likelihood with respect to the model parameters.

### Raw Kalman Gain

The raw Kalman gain is a matrix that indicates how much to weigh the observations during recursions of the Kalman filter.

The raw Kalman gain is an mt -by-ht matrix computed using

${K}_{t}={P}_{t|t-1}{C}_{t}^{\prime }{\left({C}_{t}{P}_{t|t-1}{C}_{t}^{\prime }+{D}_{t}{D}_{t}^{\prime }\right)}^{-1},$

where ${P}_{t|t-1}$ is the estimated variance-covariance matrix of the state forecasts, given all information up to period t – 1.

The value of the raw Kalman gain determines how much weight to put on the observations. For a given estimated observation innovation, if the maximum eigenvalue of DtDt is relatively small, then the raw Kalman gain imparts a relatively large weight on the observations. If the maximum eigenvalue of DtDt is relatively large, then the raw Kalman gain imparts a relatively small weight on the observations. Consequently, the filtered states at period t are close to the corresponding state forecasts.

### State Forecasts

s-step-ahead, state forecasts are estimates of the states at period t using all information (for example, the observed responses) up to period ts.

The mt-by-1 vector of 1-step-ahead, state forecasts at period t is ${x}_{t|t-1}=E\left({x}_{t}|{y}_{t-1},...,{y}_{1}\right)$. The estimated vector of state forecasts is

${\stackrel{^}{x}}_{t|t-1}={A}_{t}{\stackrel{^}{x}}_{t-1|t-1},$

where ${\stackrel{^}{x}}_{t-1|t-1}$ is the mt – 1-by-1 filtered state vector at period t – 1.

At period t, the 1-step-ahead, state forecasts have the variance-covariance matrix

${P}_{t|t-1}={A}_{t}{P}_{t-1|t-1}{A}_{t}^{\prime }+{B}_{t}{B}_{t}^{\prime },$

where${P}_{t-1|t-1}$ is the estimated variance-covariance matrix of the filtered states at period t – 1, given all information up to period t – 1.

The corresponding 1-step-ahead forecasted observation is ${\stackrel{^}{y}}_{t|t-1}={C}_{t}{\stackrel{^}{x}}_{t|t-1},$, and its variance-covariance matrix is ${V}_{t|t-1}=Var\left({y}_{t}|{y}_{t-1},...,{y}_{1}\right)={C}_{t}{P}_{t|t-1}{C}_{t}^{\prime }+{D}_{t}{D}_{t}^{\prime }.$

In general, the s-step-ahead, forecasted state vector is ${x}_{t|t-s}=E\left({x}_{t}|{y}_{t-s},...,{y}_{1}\right)$. The s-step-ahead, vector of state forecasts is

${\stackrel{^}{x}}_{t+s|t}=\left(\prod _{j=t+1}^{t+s}{A}_{j}\right){x}_{t|t}$

and the s-step-ahead, forecasted observation vector is

${\stackrel{^}{y}}_{t+s|t}={C}_{t+s}{\stackrel{^}{x}}_{t+s|t}.$

### State-Space Model

A state-space model is a discrete-time, stochastic model that contains two sets of equations:

• One describing how a latent process transitions in time (the state equation)

• Another describing how an observer measures the latent process at each period (the observation equation)

Symbolically, you can write a linear, multivariate, Gaussian state-space model using the following system of equations

$\begin{array}{l}{x}_{t}={A}_{t}{x}_{t-1}+{B}_{t}{u}_{t}\\ {y}_{t}-{Z}_{t}\beta ={C}_{t}{x}_{t}+{D}_{t}{\epsilon }_{t},\end{array}$

for t = 1,...,T.

• ${x}_{t}=\left[{x}_{t1},...,{x}_{t{m}_{t}}\right]\prime$ is an mt-dimensional state vector describing the dynamics of some, possibly unobservable, phenomenon at period t.

• ${y}_{t}=\left[{y}_{t1},...,{y}_{t{n}_{t}}\right]\prime$ is an nt-dimensional observation vector describing how the states are measured by observers at period t.

• At is the mt-by-mt – 1 state-transition matrix describing how the states at time t transition to the states at period t – 1.

• Bt is the mt-by-kt state-disturbance-loading matrix describing how the states at period t combine with the innovations at period t.

• Ct is the nt-by-mt measurement-sensitivity matrix describing how the observations at period t relate to the states at period t.

• Dt is the nt-by-ht observation-innovation matrix describing how the observations at period t combine with the observation errors at period t.

• The matrices At, Bt, Ct, and Dt are referred to as coefficient matrices, and might contain unknown parameters.

• ${u}_{t}=\left[{u}_{t1},...,{u}_{t{k}_{t}}\right]\prime$ is a kt-dimensional, Gaussian, white-noise, unit-variance vector of state disturbances at period t.

• ${\epsilon }_{t}=\left[{\epsilon }_{t1},...,{\epsilon }_{t{h}_{t}}\right]\prime$ is an ht-dimensional, Gaussian, white-noise, unit-variance vector of observation innovations at period t.

• εt and ut are uncorrelated.

• For time-invariant models,

• ${Z}_{t}=\left[\begin{array}{cccc}{z}_{t1}& {z}_{t2}& \cdots & {z}_{td}\end{array}\right]$ is row t of a T-by-d matrix of predictors Z. Each column of Z corresponds to a predictor, and each successive row to a successive period. If the observations are multivariate, then all predictors deflate each observation.

• β is a d-by-n matrix of regression coefficients for Zt.

To write a time-invariant state-space model, drop the t subscripts of all coefficient matrices and dimensions.

### Time-Invariant Model

In a time-invariant state-space model:

• The coefficient matrices are equivalent for all periods.

• The number of states, state disturbances, observations, and observation innovations are the same for all periods.

For example, for all t, the following system of equations

$\begin{array}{c}\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]=\left[\begin{array}{cc}{\varphi }_{1}& 0\\ 0& {\varphi }_{2}\end{array}\right]\left[\begin{array}{c}{x}_{1,t-1}\\ {x}_{2,t-1}\end{array}\right]+\left[\begin{array}{cc}0.5& 0\\ 0& 2\end{array}\right]\left[\begin{array}{c}{u}_{1,t}\\ {u}_{2,t}\end{array}\right]\\ {y}_{t}=\left[\begin{array}{cc}{\varphi }_{3}& 1\end{array}\right]\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]+0.2{\epsilon }_{t}\end{array}$

represents a time-invariant state-space model.

### Time-Varying Model

In a time-varying state-space model:

• The coefficient matrices might change from period to period.

• The number of states, state disturbances, observations, and observation innovations might change from period to period. For example, this might happen if there is a regime shift or one of the states or observations cannot be measured during the sampling time frame. Also, you can model seasonality using time-varying models.

To illustrate a regime shift, suppose, for t = 1,..,10

$\begin{array}{c}\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]=\left[\begin{array}{cc}{\varphi }_{1}& 0\\ 0& {\varphi }_{2}\end{array}\right]\left[\begin{array}{c}{x}_{1,t-1}\\ {x}_{2,t-1}\end{array}\right]+\left[\begin{array}{cc}0.5& 0\\ 0& 2\end{array}\right]\left[\begin{array}{c}{u}_{1,t}\\ {u}_{2,t}\end{array}\right]\\ {y}_{t}=\left[\begin{array}{cc}{\varphi }_{3}& 1\end{array}\right]\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]+0.2{\epsilon }_{t}\end{array},$

for t = 11

$\begin{array}{c}{x}_{1,t}=\left[\begin{array}{cc}{\varphi }_{4}& 0\end{array}\right]\left[\begin{array}{c}{x}_{1,t-1}\\ {x}_{2,t-1}\end{array}\right]+0.5{u}_{1,t}\\ {y}_{t}={\varphi }_{5}{x}_{1,t}+0.2{\epsilon }_{t}\end{array},$

and for t = 12,..,T

$\begin{array}{c}{x}_{1,t}={\varphi }_{4}+0.5{u}_{1,t}\\ {y}_{t}={\varphi }_{5}{x}_{1,t}+0.2{\epsilon }_{t}\end{array}.$

There are three sets of state transition matrices, whereas there are only two sets of the other coefficient matrices.

## Examples

collapse all

### Filter States of a Known, Time-Invariant State-Space Model

Suppose that a latent process is an AR(1). Subsequently, the state equation is

where is Gaussian with mean 0 and standard deviation 1.

Genrate a random series of 100 observations from , assuming that the series starts at 1.5.

T = 100; ARMdl = arima('AR',0.5,'Constant',0,'Variance',1); x0 = 1.5; rng(1); % For reproducibility x = simulate(ARMdl,T,'Y0',x0); 

Suppose further that the latent process is subject to additive measurement error. Subsequently, the observation equation is

where is Gaussian with mean 0 and standard deviation 0.75. Together, the latent process and observation equations compose a state-space model.

Use the random latent state process (x) and the observation equation to generate observations.

y = x + 0.75*randn(T,1); 

Specify the four coefficient matrices.

A = 0.5; B = 1; C = 1; D = 0.75; 

Specify the state-space model using the coefficient matrices.

Mdl = ssm(A,B,C,D) 
Mdl = State vector length: 1 Observation vector length: 1 State disturbance vector length: 1 Observation innovation vector length: 1 Sample size supported by model: Unlimited State variables: x1, x2,... State disturbances: u1, u2,... Observation series: y1, y2,... Observation innovations: e1, e2,... State equation: x1(t) = (0.50)x1(t-1) + u1(t) Observation equation: y1(t) = x1(t) + (0.75)e1(t) Initial state distribution: Initial state means x1 0 Initial state covariance matrix x1 x1 1.33 State types x1 Stationary 

Mdl is an ssm model. Verify that the model is correctly specified using the display in the Command Window. The software infers that the state process is stationary. Subsequently, the software sets the initial state mean and covariance to the mean and variance of the stationary distribution of an AR(1) model.

Filter states for periods 1 through 100. Plot the true state values and the filtered state estimates.

filteredX = filter(Mdl,y); figure plot(1:T,x,'-k',1:T,filteredX,':r','LineWidth',2) title({'State Values'}) xlabel('Period') ylabel('State') legend({'True state values','Filtered state values'}) 

The true values and filter estimates are approximately the same.

### Filter States of a State-Space Model That Includes a Regression Component

Suppose that the linear relationship between the change in the unemployment rate and the nominal gross national product (nGNP) growth rate is of interest. Suppose further that the first difference of the unemployment rate is an ARMA(1,1) series. Symbolically, and in state-space form, the model is

where:

• is the change in the unemployment rate at time t.

• is a dummy state for the MA(1) effect.

• is the observed change in the unemployment rate being deflated by the growth rate of nGNP ( ).

• is the Gaussian series of state disturbances having mean 0 and standard deviation 1.

• is the Gaussian series of observation innovations having mean 0 and standard deviation .

Load the Nelson-Plosser data set, which contains the unemployment rate and nGNP series, among other things.

load Data_NelsonPlosser 

Preprocess the data by taking the natural logarithm of the nGNP series, and the first difference of each series. Also, remove the starting NaN values from each series.

isNaN = any(ismissing(DataTable),2); % Flag periods containing NaNs gnpn = DataTable.GNPN(~isNaN); u = DataTable.UR(~isNaN); T = size(gnpn,1); % Sample size Z = [ones(T-1,1) diff(log(gnpn))]; y = diff(u); 

Though this example removes missing values, the software can accommodate series containing missing values in the Kalman filter framework.

Specify the coefficient matrices.

A = [NaN NaN; 0 0]; B = [1; 1]; C = [1 0]; D = NaN; 

Specify the state-space model using ssm.

Mdl = ssm(A,B,C,D); 

Estimate the model parameters, and use a random set of initial parameter values for optimization. Specify the regression component and its initial value for optimization using the 'Predictors' and 'Beta0' name-value pair arguments, respectively. Restrict the estimate of to all positive, real numbers.

params0 = [0.3 0.2 0.2]; [EstMdl,estParams] = estimate(Mdl,y,params0,'Predictors',Z,... 'Beta0',[0.1 0.2],'lb',[-Inf,-Inf,0,-Inf,-Inf]); 
Method: Maximum likelihood (fmincon) Sample size: 61 Logarithmic likelihood: -99.7245 Akaike info criterion: 209.449 Bayesian info criterion: 220.003 | Coeff Std Err t Stat Prob ---------------------------------------------------------- c(1) | -0.34098 0.29608 -1.15164 0.24948 c(2) | 1.05003 0.41377 2.53771 0.01116 c(3) | 0.48592 0.36790 1.32079 0.18657 y <- z(1) | 1.36121 0.22338 6.09358 0 y <- z(2) | -24.46711 1.60018 -15.29024 0 | | Final State Std Dev t Stat Prob x(1) | 1.01264 0.44690 2.26592 0.02346 x(2) | 0.77718 0.58917 1.31912 0.18713 

EstMdl is an ssm model, and you can access its properties using dot notation.

Filter the estimated state-space model. EstMdl does not store the data or the regression coefficients, so you must pass in them in using the name-value pair arguments 'Predictors' and 'Beta', respectively. Plot the estimated, filtered states. Recall that the first state is the change in the unemployment rate, and the second state helps build the first.

filteredX = filter(EstMdl,y,'Predictors',Z,'Beta',estParams(end-1:end)); figure plot(dates(end-(T-1)+1:end),filteredX(:,1)); xlabel('Period') ylabel('Change in the unemployment rate') title('Filtered Change in the Unemployment Rate') 

### Filter States of a Fitted, Time-Varying State-Space Model

This example generates data from a known model, fits a state-space model to the data, and then filters the states.

Suppose that a latent processes is composed of an AR(2) and an MA(1) model. There are 50 periods, and the MA(1) process drops out of the model for the final 25 periods. Subsequently, the state equation for the first 25 periods is

$\begin{array}{l}{x}_{1,t}=0.7{x}_{1,t-1}-0.2{x}_{1,t-2}+{u}_{1,t}\\ {x}_{2,t}={u}_{2,t}+0.6{u}_{2,t-1},\end{array}$

and for the last 25 periods, it is

${x}_{1,t}=0.7{x}_{1,t-1}-0.2{x}_{1,t-2}+{u}_{1,t},$

where u1,t and u2,t are Gaussian with mean 0 and standard deviation 1.

Generate a random series of 50 observations from x1,t and x2,t, assuming that the series starts at 1.5 and 1, respectively.

T = 50; ARMdl = arima('AR',{0.7,-0.2},'Constant',0,'Variance',1); MAMdl = arima('MA',0.6,'Constant',0,'Variance',1); x0 = [1.5 1; 1.5 1]; rng(1); x = [simulate(ARMdl,T,'Y0',x0(:,1)),... [simulate(MAMdl,T/2,'Y0',x0(:,2));nan(T/2,1)]];

The last 25 values for the simulated MA(1) data are NaNs.

Suppose further that the latent processes are measured using

${y}_{t}=2\left({x}_{1,t}+{x}_{2,t}\right)+{\epsilon }_{t},$

for the first 25 periods, and

${y}_{t}=2{x}_{1,t}+{\epsilon }_{t}$

for the last 25 periods, where εt is Gaussian with mean 0 and standard deviation 1.

Use the random latent state process (x) and the observation equation to generate observations.

y = 2*nansum(x')'+randn(T,1);

Together, the latent process and observation equations compose a state-space model. Supposing that the coefficients are unknown parameters, the state-space model is

$\begin{array}{l}\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\\ {x}_{3,t}\\ {x}_{4,t}\end{array}\right]=\left[\begin{array}{cccc}{\varphi }_{1}& {\varphi }_{2}& 0& 0\\ 1& 0& 0& 0\\ 0& 0& 0& {\theta }_{1}\\ 0& 0& 0& 0\end{array}\right]\left[\begin{array}{c}{x}_{1,t-1}\\ {x}_{2,t-1}\\ {x}_{3,t-1}\\ {x}_{4,t-1}\end{array}\right]+\left[\begin{array}{cc}1& 0\\ 0& 0\\ 0& 1\\ 0& 1\end{array}\right]\left[\begin{array}{c}{u}_{1,t}\\ {u}_{2,t}\end{array}\right]\\ {y}_{t}=a\left({x}_{1,t}+{x}_{3,t}\right)+{\epsilon }_{t}\end{array}$

for the first 25 periods,

$\begin{array}{l}\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]=\left[\begin{array}{cccc}{\varphi }_{1}& {\varphi }_{2}& 0& 0\\ 1& 0& 0& 0\end{array}\right]\left[\begin{array}{c}{x}_{1,t-1}\\ {x}_{2,t-1}\\ {x}_{3,t-1}\\ {x}_{4,t-1}\end{array}\right]+\left[\begin{array}{c}1\\ 0\end{array}\right]{u}_{1,t}\\ {y}_{t}=b{x}_{1,t}+{\epsilon }_{t}\end{array}$

for period 26, and

$\begin{array}{l}\left[\begin{array}{c}{x}_{1,t}\\ {x}_{2,t}\end{array}\right]=\left[\begin{array}{cc}{\varphi }_{1}& {\varphi }_{2}\\ 1& 0\end{array}\right]\left[\begin{array}{c}{x}_{1,t-1}\\ {x}_{2,t-1}\end{array}\right]+\left[\begin{array}{c}1\\ 0\end{array}\right]{u}_{1,t}\\ {y}_{t}=b{x}_{1,t}+{\epsilon }_{t}\end{array}$

for the last 24 periods.

Create the function Ar2MAParamMap.m, which specifies how the parameters in params map to the state-space model matrices, the initial state values, and the type of state.

function [A,B,C,D,Mean0,Cov0,StateType] = AR2MAParamMap(params,T) %AR2MAParamMap Time-variant state-space model parameter mapping function % % This function maps the vector params to the state-space matrices (A, B, % C, and D), the initial state value and the initial state variance (Mean0 % and Cov0), and the type of state (StateType). From periods 1 to T/2, the % state model is an AR(2) and an MA(1) model, and the observation model is % the sum of the two states. From periods T/2 + 1 to T, the state model is % just the AR(2) model. A1 = {[params(1) params(2) 0 0; 1 0 0 0; 0 0 0 params(3); 0 0 0 0]}; B1 = {[1 0; 0 0; 0 1; 0 1]}; C1 = {params(4)*[1 0 1 0]}; Mean0 = ones(4,1); Cov0 = 10*eye(4); StateType = [0 0 0 0]; A2 = {[params(1) params(2) 0 0; 1 0 0 0]}; B2 = {[1; 0]}; A3 = {[params(1) params(2); 1 0]}; B3 = {[1; 0]}; C3 = {params(5)*[1 0]}; A = [repmat(A1,T/2,1);A2;repmat(A3,(T-2)/2,1)]; B = [repmat(B1,T/2,1);B2;repmat(B3,(T-2)/2,1)]; C = [repmat(C1,T/2,1);repmat(C3,T/2,1)]; D = 1; end 

Specify the state-space model by passing the function AR2MAParamMap as a function handle to ssm.

Mdl = ssm(@(params)AR2MAParamMap(params,T));

The software implicitly defines the state-space model. Usually, you cannot verify state-space models that you implicitly define.

Pass the observed responses (y) to estimate to estimate the parameters. Set positive, random initial values for the unknown parameters.

params0 = rand(5,1); EstMdl = estimate(Mdl,y,params0);
Method: Maximum likelihood (fminunc) Sample size: 50 Logarithmic likelihood: -114.957 Akaike info criterion: 239.913 Bayesian info criterion: 249.473 | Coeff Std Err t Stat Prob --------------------------------------------------- c(1) | 0.47870 0.26634 1.79733 0.07229 c(2) | 0.00809 0.27179 0.02976 0.97626 c(3) | 0.55735 0.80958 0.68844 0.49118 c(4) | 1.62679 0.41622 3.90849 0.00009 c(5) | 1.90022 0.49564 3.83390 0.00013 | | Final State Std Dev t Stat Prob x(1) | -0.81229 0.46815 -1.73511 0.08272 x(2) | -0.31449 0.45917 -0.68490 0.49341 

EstMdl is an ssm model containing the estimated coefficients. Likelihood surfaces of state-space models might contain local maxima. Therefore, it is good practice to try several initial parameter values, or consider using refine.

Filter the states and obtain state forecasts by passing EstMdl and the observed responses to filter.

[~,~,Output]= filter(EstMdl,y);

Output is a T-by-1 structure array containing the filtered states and state forecasts, among other things.

Extract the filtered and forecasted states from the cell arrays. Recall that the two, different states are in positions 1 and 3. The states in positions 2 and 4 help specify the processes of interest.

stateIndx = [1 3]; % State indices of interest FilteredStates = NaN(T,numel(stateIndx)); ForecastedStates = NaN(T,numel(stateIndx)); for t = 1:T maxInd = size(Output(t).FilteredStates,1); mask = stateIndx <= maxInd; FilteredStates(t,mask) = Output(t).FilteredStates(stateIndx(mask),1); ForecastedStates(t,mask) = Output(t).ForecastedStates(stateIndx(mask),1); end

Plot the true state values, the filtered state values, and the state forecasts together for each model.

figure plot(1:T,x(:,1),'-k',1:T,FilteredStates(:,1),':r',... 1:T,ForecastedStates(:,1),'--g','LineWidth',2); title('AR(2) State Values') xlabel('Period') ylabel('State Value') legend({'True state values','Filtered state values','State forecasts'}); figure plot(1:T,x(:,2),'-k',1:T,FilteredStates(:,2),':r',... 1:T,ForecastedStates(:,2),'--g','LineWidth',2); title('MA(1) State Values') xlabel('Period') ylabel('State Value') legend({'True state values','Filtered state values','State forecasts'});

## Tips

• The software accommodates missing data. Indicate missing data using NaN values in the observed responses (Y).

• Mdl does not store the response data, predictor data, and the regression coefficients. Supply the data wherever necessary using the appropriate input or name-value pair arguments.

• To forecast a state-space model, use forecast.

## References

[1] Durbin J., and S. J. Koopman. Time Series Analysis by State Space Methods. 2nd ed. Oxford: Oxford University Press, 2012.