
This example shows how to apply the shorthand `regARIMA(p,D,q)` syntax to specify the regression model with ARIMA errors.

Specify the default regression model with ARIMA(3,1,2) errors:

$$\begin{array}{c}{y}_{t}=c+{X}_{t}\beta +{u}_{t}\\ \left(1-{a}_{1}L-{a}_{2}{L}^{2}-{a}_{3}{L}^{3}\right)\left(1-L\right){u}_{t}=\left(1+{b}_{1}L+{b}_{2}{L}^{2}\right){\epsilon}_{t}.\end{array}$$

    Mdl = regARIMA(3,1,2)

    Mdl = 
      regARIMA with properties:

         Description: "ARIMA(3,1,2) Error Model (Gaussian Distribution)"
        Distribution: Name = "Gaussian"
           Intercept: NaN
                Beta: [1×0]
                   P: 4
                   D: 1
                   Q: 2
                  AR: {NaN NaN NaN} at lags [1 2 3]
                 SAR: {}
                  MA: {NaN NaN} at lags [1 2]
                 SMA: {}
            Variance: NaN

The software sets each parameter to `NaN` and the innovation distribution to `Gaussian`. The AR coefficients are at lags 1 through 3, and the MA coefficients are at lags 1 and 2. The property `P` = *p* + *D* = 3 + 1 = 4. Therefore, the software requires at least four presample values to initialize the time series.
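The property arithmetic can be checked directly on the model object. The following is a minimal sketch, assuming Econometrics Toolbox is available; the variable names `p` and `D` are introduced here only for illustration.

```matlab
% Create the model template and query its polynomial degrees.
Mdl = regARIMA(3,1,2);

p = numel(Mdl.AR);   % number of nonseasonal AR lags
D = Mdl.D;           % degree of nonseasonal integration

% P is the number of presample observations the model needs: p + D.
isequal(Mdl.P, p + D)
```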

Pass `Mdl` into `estimate` with data to estimate the parameters set to `NaN`. The `regARIMA` model sets `Beta` to an empty array, which the display shows as `[1×0]`. If you pass a matrix of predictors ($${X}_{t}$$) into `estimate`, then `estimate` estimates `Beta`. The `estimate` function infers the number of regression coefficients in `Beta` from the number of columns in $${X}_{t}$$.
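As a rough illustration of how `estimate` sizes `Beta`, the sketch below simulates a response from a known model and then fits a template to it. The data, the predictor matrix `X`, the sample size `T`, and the generating model `TrueMdl` are all hypothetical and exist only to make the example self-contained. The template fixes `Intercept` at 0 because the error model is integrated.

```matlab
rng(1);                      % for reproducibility
T = 200;                     % hypothetical sample size
X = randn(T,2);              % two hypothetical predictor series

% Hypothetical data-generating model, used only to create a response.
TrueMdl = regARIMA('Intercept',0,'Beta',[2.5; -0.6], ...
    'AR',{0.7, -0.3, 0.1},'MA',{0.5, 0.2},'Variance',1,'D',1);
y = simulate(TrueMdl,T,'X',X);

% Template with NaN parameters; Intercept is held at 0 during estimation.
Mdl = regARIMA('ARLags',1:3,'MALags',1:2,'D',1,'Intercept',0);
EstMdl = estimate(Mdl,y,'X',X,'Display','off');

numel(EstMdl.Beta)           % one coefficient per column of X
```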

Tasks such as simulation and forecasting using `simulate` and `forecast` do not accept models with at least one `NaN` for a parameter value. Use dot notation to modify parameter values.
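For instance, dot notation can fill in every `NaN` of the template so that the model becomes usable for simulation. This is a sketch only; the coefficient values are illustrative and are not estimates from any data.

```matlab
Mdl = regARIMA(3,1,2);

% Assign a value to every NaN parameter (illustrative values only).
Mdl.Intercept = 0;
Mdl.AR = {0.7, -0.3, 0.1};
Mdl.MA = {0.5, 0.2};
Mdl.Variance = 1;

% With no NaN values left, simulate accepts the model.
rng(1);                  % for reproducibility
y = simulate(Mdl,50);    % one simulated response path of length 50
```

Because `Beta` is still empty, the model has no regression component, and `simulate` does not need predictor data here.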

Be aware that the regression model intercept (`Intercept`) is not identifiable in regression models with ARIMA errors, because differencing the errors removes any constant term. If you want to estimate `Mdl`, then you must first set `Intercept` to a value using, for example, dot notation. Otherwise, `estimate` might return a spurious estimate of `Intercept`.

This example shows how to specify a regression model with ARIMA errors without a regression intercept.

Specify the default regression model with ARIMA(3,1,2) errors:

$$\begin{array}{c}{y}_{t}={X}_{t}\beta +{u}_{t}\\ \left(1-{a}_{1}L-{a}_{2}{L}^{2}-{a}_{3}{L}^{3}\right)\left(1-L\right){u}_{t}=\left(1+{b}_{1}L+{b}_{2}{L}^{2}\right){\epsilon}_{t}.\end{array}$$

    Mdl = regARIMA('ARLags',1:3,'MALags',1:2,'D',1,'Intercept',0)

    Mdl = 
      regARIMA with properties:

         Description: "ARIMA(3,1,2) Error Model (Gaussian Distribution)"
        Distribution: Name = "Gaussian"
           Intercept: 0
                Beta: [1×0]
                   P: 4
                   D: 1
                   Q: 2
                  AR: {NaN NaN NaN} at lags [1 2 3]
                 SAR: {}
                  MA: {NaN NaN} at lags [1 2]
                 SMA: {}
            Variance: NaN

The software sets `Intercept` to 0, but all other parameters in `Mdl` are `NaN` values by default.

Because `Intercept` is not `NaN`, it is an equality constraint during estimation. In other words, if you pass `Mdl` and data into `estimate`, then `estimate` holds `Intercept` at 0 during estimation.

In general, if you want to use `estimate` to estimate a regression model with ARIMA errors where *D* > 0 or *s* > 0 (that is, with nonseasonal or seasonal integration), then you must set `Intercept` to a value before estimation.

You can modify the properties of `Mdl` using dot notation.

This example shows how to specify a regression model with ARIMA errors, where the nonzero AR and MA terms are at nonconsecutive lags.

Specify the regression model with ARIMA(8,1,4) errors:

$$\begin{array}{c}{y}_{t}={X}_{t}\beta +{u}_{t}\\ (1-{a}_{1}L-{a}_{4}{L}^{4}-{a}_{8}{L}^{8})(1-L){u}_{t}=(1+{b}_{1}L+{b}_{4}{L}^{4}){\epsilon}_{t}.\end{array}$$

    Mdl = regARIMA('ARLags',[1,4,8],'D',1,'MALags',[1,4],...
        'Intercept',0)

    Mdl = 
      regARIMA with properties:

         Description: "ARIMA(8,1,4) Error Model (Gaussian Distribution)"
        Distribution: Name = "Gaussian"
           Intercept: 0
                Beta: [1×0]
                   P: 9
                   D: 1
                   Q: 4
                  AR: {NaN NaN NaN} at lags [1 4 8]
                 SAR: {}
                  MA: {NaN NaN} at lags [1 4]
                 SMA: {}
            Variance: NaN

The AR coefficients are at lags 1, 4, and 8, and the MA coefficients are at lags 1 and 4. The software sets the coefficients at the interim lags to 0.

Pass `Mdl` and data into `estimate`. The software estimates all parameters that have the value `NaN`, and `estimate` holds all interim lag coefficients at 0 during estimation.
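To see which lags are free and which are fixed, you can inspect the `AR` and `MA` properties. The sketch below assumes that these properties return cell vectors spanning every lag up to the highest one, with 0 at the interim lags; the helper variable names are introduced only for illustration.

```matlab
Mdl = regARIMA('ARLags',[1,4,8],'D',1,'MALags',[1,4],'Intercept',0);

% Convert the coefficient cell vectors to numeric row vectors.
ar = cell2mat(Mdl.AR);          % NaN at the specified lags, 0 at interim lags
ma = cell2mat(Mdl.MA);

freeARLags  = find(isnan(ar))   % lags that estimate treats as free parameters
fixedARLags = find(ar == 0)     % interim lags held at 0
freeMALags  = find(isnan(ma))
```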

This example shows how to specify values for all parameters of a regression model with ARIMA errors.

Specify the regression model with ARIMA(3,1,2) errors:

$$\begin{array}{c}{y}_{t}={X}_{t}\left[\begin{array}{l}2.5\\ -0.6\end{array}\right]+{u}_{t}\\ \left(1-0.7L+0.3{L}^{2}-0.1{L}^{3}\right)\left(1-L\right){u}_{t}=\left(1+0.5L+0.2{L}^{2}\right){\epsilon}_{t},\end{array}$$

where $${\epsilon}_{t}$$ is Gaussian with unit variance.

    Mdl = regARIMA('Intercept',0,'Beta',[2.5; -0.6],...
        'AR',{0.7, -0.3, 0.1},'MA',{0.5, 0.2},...
        'Variance',1,'D',1)

    Mdl = 
      regARIMA with properties:

         Description: "Regression with ARIMA(3,1,2) Error Model (Gaussian Distribution)"
        Distribution: Name = "Gaussian"
           Intercept: 0
                Beta: [2.5 -0.6]
                   P: 4
                   D: 1
                   Q: 2
                  AR: {0.7 -0.3 0.1} at lags [1 2 3]
                 SAR: {}
                  MA: {0.5 0.2} at lags [1 2]
                 SMA: {}
            Variance: 1

The parameters in `Mdl` do not contain `NaN` values, and therefore there is no need to estimate them. Instead, you can simulate or forecast responses by passing `Mdl` to `simulate` or `forecast`.
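A minimal sketch of both tasks follows, assuming Econometrics Toolbox is available. The predictor matrix `X` is hypothetical random data; the model has two regression coefficients, so `X` needs two columns, with extra rows reserved for the forecast horizon.

```matlab
Mdl = regARIMA('Intercept',0,'Beta',[2.5; -0.6], ...
    'AR',{0.7, -0.3, 0.1},'MA',{0.5, 0.2},'Variance',1,'D',1);

rng(1);                        % for reproducibility
X = randn(120,2);              % hypothetical predictors: 100 in sample, 20 future

% Simulate 100 responses using the in-sample predictors.
y = simulate(Mdl,100,'X',X(1:100,:));

% Forecast 20 periods ahead, treating the simulated path as presample data.
yF = forecast(Mdl,20,'Y0',y,'X0',X(1:100,:),'XF',X(101:120,:));
```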

This example shows how to set the innovation distribution of a regression model with ARIMA errors to a *t* distribution.

Specify the regression model with ARIMA(3,1,2) errors:

$$\begin{array}{c}{y}_{t}={X}_{t}\left[\begin{array}{l}2.5\\ -0.6\end{array}\right]+{u}_{t}\\ \left(1-0.7L+0.3{L}^{2}-0.1{L}^{3}\right)\left(1-L\right){u}_{t}=\left(1+0.5L+0.2{L}^{2}\right){\epsilon}_{t},\end{array}$$

where $${\epsilon}_{t}$$ has a *t* distribution with the default degrees of freedom and unit variance.

    Mdl = regARIMA('Intercept',0,'Beta',[2.5; -0.6],...
        'AR',{0.7, -0.3, 0.1},'MA',{0.5, 0.2},'Variance',1,...
        'Distribution','t','D',1)

    Mdl = 
      regARIMA with properties:

         Description: "Regression with ARIMA(3,1,2) Error Model (t Distribution)"
        Distribution: Name = "t", DoF = NaN
           Intercept: 0
                Beta: [2.5 -0.6]
                   P: 4
                   D: 1
                   Q: 2
                  AR: {0.7 -0.3 0.1} at lags [1 2 3]
                 SAR: {}
                  MA: {0.5 0.2} at lags [1 2]
                 SMA: {}
            Variance: 1

The default degrees of freedom is `NaN`. If you don't know the degrees of freedom, then you can estimate it by passing `Mdl` and the data to `estimate`.

Specify a $${t}_{10}$$ distribution.

    Mdl.Distribution = struct('Name','t','DoF',10)

    Mdl = 
      regARIMA with properties:

         Description: "Regression with ARIMA(3,1,2) Error Model (t Distribution)"
        Distribution: Name = "t", DoF = 10
           Intercept: 0
                Beta: [2.5 -0.6]
                   P: 4
                   D: 1
                   Q: 2
                  AR: {0.7 -0.3 0.1} at lags [1 2 3]
                 SAR: {}
                  MA: {0.5 0.2} at lags [1 2]
                 SMA: {}
            Variance: 1

You can simulate or forecast responses by passing `Mdl` to `simulate` or `forecast` because `Mdl` is completely specified.

In applications such as simulation, the software normalizes the random *t* innovations. In other words, `Variance` overrides the theoretical variance of the *t* random variable (which is `DoF`/(`DoF` - 2)), but preserves the kurtosis of the distribution.
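One standard way to write such a normalization is shown below. It is a worked sketch of the statistics, not a description of the software's internal implementation. Here $${T}_{\nu}$$ denotes a standard *t* random variable with $$\nu$$ = `DoF` > 2 degrees of freedom, and $${\sigma}^{2}$$ denotes `Variance`; both symbols are introduced only for this illustration.

$$\begin{array}{c}\mathrm{Var}\left({T}_{\nu}\right)=\frac{\nu}{\nu -2},\\ {\epsilon}_{t}=\sigma \sqrt{\frac{\nu -2}{\nu}}\,{T}_{\nu}\quad \Rightarrow \quad \mathrm{Var}\left({\epsilon}_{t}\right)={\sigma}^{2}\cdot \frac{\nu -2}{\nu}\cdot \frac{\nu}{\nu -2}={\sigma}^{2}.\end{array}$$

Kurtosis does not change when a random variable is rescaled, so $${\epsilon}_{t}$$ keeps the kurtosis of $${T}_{\nu}$$, which is $$3\left(\nu -2\right)/\left(\nu -4\right)$$ for $$\nu$$ > 4. For the $${t}_{10}$$ distribution specified above, the theoretical variance is 10/8 = 1.25, the scaling factor with unit `Variance` is $$\sqrt{0.8}$$, and the kurtosis is 3(8)/6 = 4, compared with 3 for a Gaussian distribution.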

`regARIMA` | `estimate` | `simulate` | `forecast`

- Create Regression Models with ARIMA Errors
- Specify the Default Regression Model with ARIMA Errors
- Create Regression Models with AR Errors
- Create Regression Models with MA Errors
- Create Regression Models with ARMA Errors
- Create Regression Models with SARIMA Errors
- Specify ARIMA Error Model Innovation Distribution