regARIMA class

Create regression model with ARIMA time series errors

Description

regARIMA creates a regression model with ARIMA time series errors to maintain the sensitivity interpretation of regression coefficients.

By default, the time series errors (also called unconditional disturbances) are independent, identically distributed, mean 0 Gaussian random variables. If the errors have an autocorrelation structure, then you can specify models for them. The models include:

• moving average (MA)

• autoregressive (AR)

• mixed autoregressive and moving average (ARMA)

• integrated (ARIMA)

• multiplicative seasonal (SARIMA)

Specify an error model with known coefficients, or leave coefficients as NaN so that they can be estimated, and then:

• Simulate responses using simulate.

• Explore impulse responses using impulse.

• Forecast future observations using forecast.

• Estimate unknown coefficients with data using estimate.
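
A minimal sketch of the first three workflows listed above, assuming Econometrics Toolbox is installed. The coefficient values, the random predictor matrix X, and the sample and horizon lengths are illustrative assumptions, not values prescribed by this page.

```matlab
% Fully specified model: every coefficient is known (no NaN values).
Mdl = regARIMA('Intercept',2,'Beta',[1.5 0.2],'AR',{0.2 0.3}, ...
    'MA',{0.1},'Variance',0.5);

rng(1);                     % fix the random seed for reproducibility
T = 100;                    % in-sample length (illustrative)
h = 10;                     % forecast horizon (illustrative)
X = randn(T + h,2);         % hypothetical predictor data

% Simulate one response path over the first T observations.
y = simulate(Mdl,T,'X',X(1:T,:));

% Impulse response of the error model over h periods.
irf = impulse(Mdl,h);

% Forecast h periods ahead, conditioning on the simulated sample.
[yF,yMSE] = forecast(Mdl,h,'Y0',y,'X0',X(1:T,:),'XF',X(T+1:end,:));
```

Because the model contains regression coefficients, simulate and forecast both need predictor data that covers the periods they generate.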

Construction

Mdl = regARIMA creates a regression model with degree 0 ARIMA errors and no regression coefficient.

Mdl = regARIMA(p,D,q) creates a regression model with errors modeled by a nonseasonal, linear time series with autoregressive degree p, differencing degree D, and moving average degree q.

Mdl = regARIMA(Name,Value) creates a regression model with ARIMA errors using additional options specified by one or more Name,Value pair arguments. Name can also be a property name and Value is the corresponding value. Name must appear inside single quotes (''). You can specify several Name,Value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
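
The three syntaxes above can be compared side by side. This is a small sketch, assuming Econometrics Toolbox is installed; the variable names Mdl0, Mdl1, and Mdl2 are illustrative.

```matlab
% Default model: degree 0 errors and no regression coefficient.
Mdl0 = regARIMA;

% Shorthand syntax: ARMA(1,1) errors with unknown (NaN) coefficients.
Mdl1 = regARIMA(1,0,1);

% Name-value syntax: the same error structure, specified through lags.
Mdl2 = regARIMA('ARLags',1,'MALags',1);

% Both nonseasonal specifications report the same polynomial degrees.
fprintf('Mdl1: P = %d, Q = %d\n',Mdl1.P,Mdl1.Q);
fprintf('Mdl2: P = %d, Q = %d\n',Mdl2.P,Mdl2.Q);
```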

Input Arguments

Note: For regression models with nonseasonal ARIMA errors, use p, D, and q. For regression models with seasonal ARIMA errors, use Name,Value pair arguments.

p

Nonseasonal, autoregressive polynomial degree for the error model, specified as a positive integer.

D

Nonseasonal integration degree for the error model, specified as a nonnegative integer.

q

Nonseasonal, moving average polynomial degree for the error model, specified as a positive integer.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'Intercept'

Regression model intercept, specified as the comma-separated pair consisting of 'Intercept' and a scalar.

Default: NaN

'Beta'

Regression model coefficients associated with the predictor data, specified as the comma-separated pair consisting of 'Beta' and a vector.

Default: [] (no regression coefficients corresponding to predictor data)

'AR'

Nonseasonal, autoregressive coefficients for the error model, specified as the comma-separated pair consisting of 'AR' and a cell vector. The coefficients must yield a stable polynomial.

• If you specify ARLags, then AR is an equivalent-length cell vector of coefficients associated with the lags in ARLags. For example, if ARLags = [1, 4] and AR = {0.2, 0.1}, then, ignoring all other specifications, the error model is ${u}_{t}=0.2{u}_{t-1}+0.1{u}_{t-4}+{\epsilon }_{t}.$

• If you do not specify ARLags, then AR is a cell vector of coefficients at lags 1,2,...,p, which is the nonseasonal, autoregressive polynomial degree. For example, if AR = {0.2, 0.1} and you do not specify ARLags, then, ignoring all other specifications, the error model is ${u}_{t}=0.2{u}_{t-1}+0.1{u}_{t-2}+{\epsilon }_{t}.$

Default: Cell vector of NaNs with the same length as ARLags.

'MA'

Nonseasonal, moving average coefficients for the error model, specified as the comma-separated pair consisting of 'MA' and a cell vector. The coefficients must yield an invertible polynomial.

• If you specify MALags, then MA is an equivalent-length cell vector of coefficients associated with the lags in MALags. For example, if MALags = [1, 4] and MA = {0.2, 0.1}, then, ignoring all other specifications, the error model is ${u}_{t}={\epsilon }_{t}+0.2{\epsilon }_{t-1}+0.1{\epsilon }_{t-4}.$

• If you do not specify MALags, then MA is a cell vector of coefficients at lags 1,2,...,q, which is the nonseasonal, moving average polynomial degree. For example, if MA = {0.2, 0.1} and you do not specify MALags, then, ignoring all other specifications, the error model is ${u}_{t}={\epsilon }_{t}+0.2{\epsilon }_{t-1}+0.1{\epsilon }_{t-2}.$

Default: Cell vector of NaNs with the same length as MALags.

'ARLags'

Lags associated with the AR coefficients in the error model, specified as the comma-separated pair consisting of 'ARLags' and a vector of positive integers.

Default: Vector of integers 1,2,...,p, the nonseasonal, autoregressive polynomial degree.

'MALags'

Lags associated with the MA coefficients in the error model, specified as the comma-separated pair consisting of 'MALags' and a vector of positive integers.

Default: Vector of integers 1,2,...,q, the nonseasonal moving average polynomial degree.
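
The pairing of coefficients with lags described for AR, MA, ARLags, and MALags can be sketched as follows, assuming Econometrics Toolbox is installed. The example combines the AR and MA specifications used above, so the error model is ${u}_{t}=0.2{u}_{t-1}+0.1{u}_{t-4}+{\epsilon }_{t}+0.2{\epsilon }_{t-1}+0.1{\epsilon }_{t-4}.$

```matlab
% Coefficients are matched to lags by position: 0.2 at lag 1, 0.1 at lag 4.
Mdl = regARIMA('ARLags',[1 4],'AR',{0.2,0.1}, ...
               'MALags',[1 4],'MA',{0.2,0.1});

% The polynomial degrees follow the largest lag, not the number of coefficients.
fprintf('P = %d, Q = %d\n',Mdl.P,Mdl.Q);
```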

'SAR'

Seasonal, autoregressive coefficients for the error model, specified as the comma-separated pair consisting of 'SAR' and a cell vector. The coefficients must yield a stable polynomial.

• If you specify SARLags, then SAR is an equivalent-length cell vector of coefficients associated with the lags in SARLags. For example, if SARLags = [1, 4], SAR = {0.2, 0.1}, and Seasonality = 4, then, ignoring all other specifications, the error model is

$\left(1-0.2L-0.1{L}^{4}\right)\left(1-{L}^{4}\right){u}_{t}={\epsilon }_{t}.$

• If you do not specify SARLags, then SAR is a cell vector of coefficients at lags 1,2,...,ps, which is the seasonal, autoregressive polynomial degree. For example, if SAR = {0.2, 0.1} and Seasonality = 4, and you do not specify SARLags, then, ignoring all other specifications, the error model is

$\left(1-0.2L-0.1{L}^{2}\right)\left(1-{L}^{4}\right){u}_{t}={\epsilon }_{t}.$

Default: Cell vector of NaNs with the same length as SARLags.

'SMA'

Seasonal, moving average coefficients for the error model, specified as the comma-separated pair consisting of 'SMA' and a cell vector. The coefficients must yield an invertible polynomial.

• If you specify SMALags, then SMA is an equivalent-length cell vector of coefficients associated with the lags in SMALags. For example, if SMALags = [1, 4], SMA = {0.2, 0.1}, and Seasonality = 4, then, ignoring all other specifications, the error model is $\left(1-{L}^{4}\right){u}_{t}=\left(1+0.2L+0.1{L}^{4}\right){\epsilon }_{t}.$

• If you do not specify SMALags, then SMA is a cell vector of coefficients at lags 1,2,...,qs, the seasonal, moving average polynomial degree. For example, if SMA = {0.2, 0.1} and Seasonality = 4, and you do not specify SMALags, then, ignoring all other specifications, the error model is $\left(1-{L}^{4}\right){u}_{t}=\left(1+0.2L+0.1{L}^{2}\right){\epsilon }_{t}.$

Default: Cell vector of NaNs with the same length as SMALags.

'SARLags'

Lags associated with the SAR coefficients in the error model, specified as the comma-separated pair consisting of 'SARLags' and a vector of positive integers.

Default: Vector of integers 1,2,...,ps, the seasonal, autoregressive polynomial degree.

'SMALags'

Lags associated with the SMA coefficients in the error model, specified as the comma-separated pair consisting of 'SMALags' and a vector of positive integers.

Default: Vector of integers 1,2,...,qs, the seasonal moving average polynomial degree.

'D'

Nonseasonal differencing polynomial degree (i.e., nonseasonal integration degree) for the error model, specified as the comma-separated pair consisting of 'D' and a nonnegative integer.

Default: 0 (no nonseasonal integration)

'Seasonality'

Seasonal differencing polynomial degree (i.e., seasonal integration degree) for the error model, specified as the comma-separated pair consisting of 'Seasonality' and a nonnegative integer.

Default: 0 (no seasonal integration)

'Variance'

Variance of the model innovations εt, specified as the comma-separated pair consisting of 'Variance' and a positive scalar.

Default: NaN

'Distribution'

Conditional probability distribution of the innovation process, specified as the comma-separated pair consisting of 'Distribution' and a string or a structure.

You can specify each distribution as a string or as a structure:

• Gaussian: string 'Gaussian', or structure struct('Name','Gaussian').

• Student's t: string 't' (by default, DoF is NaN), or structure struct('Name','t','DoF',DoF), where DoF > 2 or DoF = NaN.

Default: 'Gaussian'
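
The string and structure forms of 'Distribution' can be sketched as follows, assuming Econometrics Toolbox is installed. The degrees of freedom value 8 and the variable names are illustrative.

```matlab
% Gaussian innovations, specified as a string.
MdlG = regARIMA('Distribution','Gaussian');

% Student's t innovations as a string: DoF is left as NaN (to be estimated).
MdlT1 = regARIMA('Distribution','t');

% Student's t innovations as a structure: DoF is set to a known value.
MdlT2 = regARIMA('Distribution',struct('Name','t','DoF',8));

% The property is stored as a structure in every case.
disp(MdlG.Distribution)
disp(MdlT1.Distribution)
disp(MdlT2.Distribution)
```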

Notes:

• Each AR, SAR, MA, and SMA coefficient is associated with an underlying lag operator polynomial and is subject to a near-zero tolerance exclusion test. That is, the software compares each coefficient to the default lag operator zero tolerance, 1e-12. If the magnitude of a coefficient is greater than 1e-12, then the software includes it in the model. Otherwise, the software considers the coefficient sufficiently close to 0, and excludes it from the model. For additional details, see LagOp.

• Specify the lags associated with the seasonal polynomials SAR and SMA in the periodicity of the observed data, and not as multiples of the Seasonality parameter. This convention does not conform to standard Box and Jenkins [1] notation, but it is a more flexible approach for incorporating multiplicative seasonality.

Properties

AR

Cell vector of nonseasonal, autoregressive coefficients corresponding to a stable polynomial of the error model. Associated lags are 1,2,...,p, which is the nonseasonal, autoregressive polynomial degree, or as specified in ARLags.

Beta

Real vector of regression coefficients corresponding to the columns of the predictor data matrix.

D

Nonnegative integer indicating the nonseasonal integration degree of the error model.

Distribution

Data structure for the conditional probability distribution of the innovation process. The field Name stores the distribution name 'Gaussian' or 't'. If the distribution is 't', then the structure also has the field DoF that stores the degrees of freedom.

Intercept

Scalar intercept in the regression model.

MA

Cell vector of nonseasonal moving average coefficients corresponding to an invertible polynomial of the error model. Associated lags are 1,2,...,q, which is the nonseasonal moving average polynomial degree, or as specified in MALags.

P

Scalar, compound autoregressive polynomial degree of the error model. P is the total number of lagged observations necessary to initialize the autoregressive component of the error model. P includes the effects of nonseasonal and seasonal integration captured by the properties D and Seasonality, respectively, and the nonseasonal and seasonal autoregressive polynomials AR and SAR, respectively. P does not necessarily conform to standard Box and Jenkins notation [1]. If D = 0, Seasonality = 0, and SAR = {}, then P conforms to the standard notation.

Q

Scalar, compound moving average polynomial degree of the error model. Q is the total number of lagged innovations necessary to initialize the moving average component of the model. Q includes the effects of nonseasonal and seasonal moving average polynomials MA and SMA, respectively. Q does not necessarily conform to standard Box and Jenkins notation [1]. If SMA = {}, then Q conforms to the standard notation.

SAR

Cell vector of seasonal autoregressive coefficients corresponding to a stable polynomial of the error model. Associated lags are 1,2,...,ps, which is the seasonal autoregressive polynomial degree, or as specified in SARLags.

SMA

Cell vector of seasonal moving average coefficients corresponding to an invertible polynomial of the error model. Associated lags are 1,2,...,qs, which is the seasonal moving average polynomial degree, or as specified in SMALags.

Seasonality

Nonnegative integer indicating the seasonal integration degree of the error model.

Variance

Positive scalar variance of the model innovations.
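
Because the compound polynomials are products, their degrees are the sums of the degrees of their factors. The sketch below checks this for P and Q on a monthly-style specification, assuming Econometrics Toolbox is installed. The lags and the period 12 are illustrative choices, and expectedP and expectedQ are hypothetical helper variables.

```matlab
% Unknown coefficients at nonseasonal lag 1 and seasonal lag 12,
% with nonseasonal (D = 1) and seasonal (period 12) integration.
Mdl = regARIMA('ARLags',1,'SARLags',12,'MALags',1,'SMALags',12, ...
    'D',1,'Seasonality',12);

expectedP = 1 + 1 + 12 + 12;   % p + D + ps + s
expectedQ = 1 + 12;            % q + qs

fprintf('P = %d (sum of degrees: %d)\n',Mdl.P,expectedP);
fprintf('Q = %d (sum of degrees: %d)\n',Mdl.Q,expectedQ);
```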

Methods

• arima: Convert regression model with ARIMA errors to ARIMAX model

• estimate: Estimate parameters of regression models with ARIMA errors

• filter: Filter disturbances through regression model with ARIMA errors

• forecast: Forecast responses of regression model with ARIMA errors

• impulse: Impulse response of regression model with ARIMA errors

• infer: Infer innovations of regression models with ARIMA errors

• print: Display estimation results for regression models with ARIMA errors

• simulate: Monte Carlo simulation of regression model with ARIMA errors

Definitions

Regression Model with ARIMA Time Series Errors

A regression model with ARIMA time series errors explains the behavior of a response using a linear regression on predictor data, where the errors have autocorrelation indicative of an ARIMA process.

The model has the following form (in lag operator notation):

$\begin{array}{c}{y}_{t}=c+{X}_{t}\beta +{u}_{t}\\ a\left(L\right)A\left(L\right){\left(1-L\right)}^{D}\left(1-{L}^{s}\right){u}_{t}=b\left(L\right)B\left(L\right){\epsilon }_{t},\end{array}$

where

• t = 1,...,T.

• ${y}_{t}$ is the response series.

• ${X}_{t}$ is row t of X, which is the matrix of concatenated predictor data vectors. That is, ${X}_{t}$ is observation t of each predictor series.

• c is the regression model intercept.

• β is the regression coefficient.

• ${u}_{t}$ is the disturbance series.

• ${\epsilon }_{t}$ is the innovations series.

• ${L}^{j}{y}_{t}={y}_{t-j}.$

• $a\left(L\right)=\left(1-{a}_{1}L-...-{a}_{p}{L}^{p}\right),$ which is the degree p, nonseasonal autoregressive polynomial.

• $A\left(L\right)=\left(1-{A}_{1}L-...-{A}_{{p}_{s}}{L}^{{p}_{s}}\right),$ which is the degree ${p}_{s}$, seasonal autoregressive polynomial.

• ${\left(1-L\right)}^{D},$ which is the degree D, nonseasonal integration polynomial.

• $\left(1-{L}^{s}\right),$ which is the degree s, seasonal integration polynomial.

• $b\left(L\right)=\left(1+{b}_{1}L+...+{b}_{q}{L}^{q}\right),$ which is the degree q, nonseasonal moving average polynomial.

• $B\left(L\right)=\left(1+{B}_{1}L+...+{B}_{{q}_{s}}{L}^{{q}_{s}}\right),$ which is the degree ${q}_{s}$, seasonal moving average polynomial.

Regression models with ARIMA errors contain a hierarchy of error series. The unconditional disturbance, ${u}_{t}$, also called the structural disturbance, is based on the structural regression component. The conditional error (the one-step-ahead forecast or prediction error), ${\epsilon }_{t}$, is the innovation of ${u}_{t}$.

Note: The degrees of the lag operators in the seasonal polynomials A(L) and B(L) do not conform to those defined by Box and Jenkins [1]. In other words, Econometrics Toolbox™ does not treat ${p}_{1}=s,{p}_{2}=2s,...,{p}_{s}={c}_{p}s$ nor ${q}_{1}=s,{q}_{2}=2s,...,{q}_{s}={c}_{q}s,$ where ${c}_{p}$ and ${c}_{q}$ are positive integers. The software is flexible as it lets you specify the lag operator degrees. See Multiplicative ARIMA Model Specifications.
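
The model form above can be made concrete with a small data-generating sketch in base MATLAB (no toolbox calls). It uses ARMA(1,1) errors with no integration or seasonal terms. All parameter values and variable names are illustrative assumptions.

```matlab
% Hypothetical parameter values chosen for illustration.
c = 2;                 % regression intercept
beta = [1.5; 0.2];     % regression coefficients
a1 = 0.5;              % AR(1) coefficient of the errors
b1 = 0.3;              % MA(1) coefficient of the errors
sigma2 = 0.5;          % innovation variance
T = 200;               % sample size

X = randn(T,2);                 % hypothetical predictor data
e = sqrt(sigma2)*randn(T,1);    % Gaussian innovations

% a(L)u_t = b(L)e_t, that is, u_t = a1*u_{t-1} + e_t + b1*e_{t-1}.
% filter applies the rational lag polynomial b(L)/a(L) with zero initial conditions.
u = filter([1 b1],[1 -a1],e);

% Regression equation: y_t = c + X_t*beta + u_t.
y = c + X*beta + u;
```

The regression coefficients act on the predictors only, while all of the serial correlation enters through u. This separation is what preserves the sensitivity interpretation of the regression coefficients.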

Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects in the MATLAB® documentation.

Examples

Specify a Regression Model with Nonseasonal ARIMA Errors

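Specify the following regression model with ARIMA(2,1,3) errors:

$\begin{array}{c}{y}_{t}=c+{u}_{t}\\ \left(1-{a}_{1}L-{a}_{2}{L}^{2}\right)\left(1-L\right){u}_{t}=\left(1+{b}_{1}L+{b}_{2}{L}^{2}+{b}_{3}{L}^{3}\right){\epsilon }_{t}.\end{array}$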

Mdl = regARIMA(2,1,3)

Mdl =

ARIMA(2,1,3) Error Model:
--------------------------
Distribution: Name = 'Gaussian'
Intercept: NaN
P: 3
D: 1
Q: 3
AR: {NaN NaN} at Lags [1 2]
SAR: {}
MA: {NaN NaN NaN} at Lags [1 2 3]
SMA: {}
Variance: NaN


The output displays the values of the properties P, D, and Q of Mdl. The corresponding autoregressive and moving average coefficients (contained in AR and MA) are cell arrays containing the correct number of NaN values. Note that P = p + D = 3, indicating that you need three presample observations to initialize the model for estimation.
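
To see why P = p + D in this example, a short worked expansion may help. Multiplying the nonseasonal autoregressive polynomial (p = 2) by the integration polynomial (D = 1), in the notation of the Definitions section, gives

$\left(1-{a}_{1}L-{a}_{2}{L}^{2}\right)\left(1-L\right)=1-\left(1+{a}_{1}\right)L+\left({a}_{1}-{a}_{2}\right){L}^{2}+{a}_{2}{L}^{3}.$

The product has degree 3, so the recursion for ${u}_{t}$ reaches back three periods, which matches P = 3.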

Modify a Regression Model with ARIMA Errors

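Define the regression model with ARIMA errors:

$\begin{array}{c}{y}_{t}=2+{X}_{t}\left[\begin{array}{c}1.5\\ 0.2\end{array}\right]+{u}_{t}\\ {u}_{t}=0.2{u}_{t-1}+0.3{u}_{t-2}+{\epsilon }_{t}+0.1{\epsilon }_{t-1},\end{array}$

where ${\epsilon }_{t}$ is Gaussian with variance 0.5.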

Mdl = regARIMA('Intercept',2,'AR',{0.2 0.3},'MA',{0.1},...
'Variance',0.5,'Beta',[1.5 0.2])

Mdl =

Regression with ARIMA(2,0,1) Error Model:
------------------------------------------
Distribution: Name = 'Gaussian'
Intercept: 2
Beta: [1.5 0.2]
P: 2
D: 0
Q: 1
AR: {0.2 0.3} at Lags [1 2]
SAR: {}
MA: {0.1} at Lags [1]
SMA: {}
Variance: 0.5


Mdl is fully specified, so you can use it to, for example, simulate a series of responses given the predictor data matrix, X.

Modify the model to estimate the regression coefficients, the AR terms, and the variance of the innovations.

Mdl.Beta = [NaN NaN];
Mdl.AR   = {NaN NaN};
Mdl.Variance = NaN;


Change the innovations distribution to a t distribution with 15 degrees of freedom.

Mdl.Distribution = struct('Name','t','DoF',15)

Mdl =

Regression with ARIMA(2,0,1) Error Model:
------------------------------------------
Distribution: Name = 't', DoF = 15
Intercept: 2
Beta: [NaN NaN]
P: 2
D: 0
Q: 1
AR: {NaN NaN} at Lags [1 2]
SAR: {}
MA: {0.1} at Lags [1]
SMA: {}
Variance: NaN
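
A partially specified model like this one is typically passed to estimate. The following sketch assumes Econometrics Toolbox is installed, simulates data from a fully specified version of the model, and then fits the template. The sample size, the random predictors X, and the names TrueMdl and EstMdl are illustrative assumptions. As far as the documented behavior of estimate goes, NaN values are estimated and non-NaN values are held fixed.

```matlab
rng(1);                  % fix the random seed for reproducibility
T = 300;                 % sample size (illustrative)
X = randn(T,2);          % hypothetical predictor data

% Data-generating model with known coefficients and t innovations.
TrueMdl = regARIMA('Intercept',2,'Beta',[1.5 0.2],'AR',{0.2 0.3}, ...
    'MA',{0.1},'Variance',0.5, ...
    'Distribution',struct('Name','t','DoF',15));
y = simulate(TrueMdl,T,'X',X);

% Template for estimation: NaN marks the parameters to estimate.
Mdl = TrueMdl;
Mdl.Beta = [NaN NaN];
Mdl.AR = {NaN NaN};
Mdl.Variance = NaN;

% Fit the template to the simulated data without printing the results table.
EstMdl = estimate(Mdl,y,'X',X,'Display','off');
disp(EstMdl.Beta)
```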


Specify a Regression Model with SARIMA Errors

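Specify the following model:

$\begin{array}{c}{y}_{t}=1+6{X}_{t}+{u}_{t}\\ \left(1-0.2L\right)\left(1-L\right)\left(1-0.5{L}^{4}-0.2{L}^{8}\right)\left(1-{L}^{4}\right){u}_{t}=\left(1+0.1L\right)\left(1+0.05{L}^{4}+0.01{L}^{8}\right){\epsilon }_{t},\end{array}$

where ${\epsilon }_{t}$ is Gaussian with variance 1.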

Mdl = regARIMA('Intercept',1,'Beta',6,'AR',0.2,...
'MA',0.1,'SAR',{0.5,0.2},'SARLags',[4, 8],...
'SMA',{0.05,0.01},'SMALags',[4 8],'D',1,...
'Seasonality',4,'Variance',1)

Mdl =

Regression with ARIMA(1,1,1) Error Model Seasonally Integrated with Seasonal AR(8) and MA(8):
------------------------------------------------------------------------------------------------
Distribution: Name = 'Gaussian'
Intercept: 1
Beta: [6]
P: 14
D: 1
Q: 9
AR: {0.2} at Lags [1]
SAR: {0.5 0.2} at Lags [4 8]
MA: {0.1} at Lags [1]
SMA: {0.05 0.01} at Lags [4 8]
Seasonality: 4
Variance: 1


If you do not specify SARLags or SMALags, then the coefficients in SAR and SMA correspond to lags 1 and 2 by default.

Mdl = regARIMA('Intercept',1,'Beta',6,'AR',0.2,...
'MA',0.1,'SAR',{0.5,0.2},'SMA',{0.05,0.01},...
'D',1,'Seasonality',4,'Variance',1)

Mdl =

Regression with ARIMA(1,1,1) Error Model Seasonally Integrated with Seasonal AR(2) and MA(2):
------------------------------------------------------------------------------------------------
Distribution: Name = 'Gaussian'
Intercept: 1
Beta: [6]
P: 8
D: 1
Q: 3
AR: {0.2} at Lags [1]
SAR: {0.5 0.2} at Lags [1 2]
MA: {0.1} at Lags [1]
SMA: {0.05 0.01} at Lags [1 2]
Seasonality: 4
Variance: 1


References

[1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.