Box-Jenkins Differencing vs. ARIMA Estimation

This example shows how to estimate an ARIMA model with nonseasonal integration using estimate. The series is not differenced before estimation. The results are compared to a Box-Jenkins modeling strategy, where the data are first differenced, and then modeled as a stationary ARMA model (Box et al., 1994).

The time series is the log quarterly Australian Consumer Price Index (CPI) measured from 1972 through 1991.

Load the Data

Load and plot the Australian CPI data.

load Data_JAustralian
y = DataTable.PAU;
T = length(y);

figure
plot(y);
h = gca;          % Handle to the current axes
h.XLim = [0,T];   % Set x-axis limits
h.XTick = 1:10:T; % Place a tick at every 10th observation
h.XTickLabel = datestr(dates(1:10:T),17); % Label the ticks as quarter-year (QQ-YY)
title('Log Quarterly Australian CPI')

The series is nonstationary, with a clear upward trend. This suggests differencing the data before using a stationary model (as suggested by the Box-Jenkins methodology), or fitting a nonstationary ARIMA model directly.
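As a quick visual check of the first option, the differenced series can be plotted before fitting anything. The sketch below assumes the same Data_JAustralian data set and variable names used above; the figure title is illustrative only.

```matlab
load Data_JAustralian      % provides DataTable and dates (Econometrics Toolbox data set)
y = DataTable.PAU;         % log quarterly Australian CPI
T = length(y);

dY = diff(y);              % first difference; one observation shorter than y

figure
plot(2:T,dY)               % align each difference with the later of its two quarters
h = gca;
h.XLim = [0,T];            % same x-axis range as the plot of the levels
title('Differenced Log Quarterly Australian CPI')
```

If the differenced series fluctuates around a roughly constant level, a stationary ARMA model for dY is a reasonable next step.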

Estimate an ARIMA Model

Specify an ARIMA(2,1,0) model, and estimate.

Mdl = arima(2,1,0);
EstMdl = estimate(Mdl,y);
 
    ARIMA(2,1,0) Model:
    --------------------
    Conditional Probability Distribution: Gaussian

                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
     Constant      0.0100723    0.00328015        3.07069
        AR{1}       0.212059     0.0954278        2.22219
        AR{2}       0.337282      0.103781        3.24994
     Variance    9.23017e-05   1.11119e-05        8.30659

The estimated model is

$$\Delta {y_t} = 0.01 + 0.21\Delta {y_{t - 1}} + 0.34\Delta {y_{t - 2}} + {\varepsilon _t},$$

where $\varepsilon_t$ is normally distributed with standard deviation 0.01.
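The standard deviation quoted here is the square root of the estimated variance, rounded. A minimal check using the value printed in the display (the variable names are made up for illustration):

```matlab
estVariance = 9.23017e-05;     % Variance row of the estimation display
sigmaHat = sqrt(estVariance);  % innovation standard deviation, roughly 0.0096
fprintf('%.4f\n',sigmaHat)
```

With the fitted model in the workspace, the same quantity should be available as sqrt(EstMdl.Variance).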

The estimated AR coefficients are reported with the signs they have on the right side of the model equation. In lag operator polynomial notation, the fitted model is

$$(1 - 0.21L - 0.34{L^2})(1 - L){y_t} = 0.01 + {\varepsilon _t},$$

with the opposite signs on the AR coefficients.
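To see where the sign change comes from, expand the left side of the lag operator form. A short worked step, using only the rounded coefficients above:

$$(1 - 0.21L - 0.34{L^2})(1 - L){y_t} = (1 - 0.21L - 0.34{L^2})\Delta {y_t} = \Delta {y_t} - 0.21\Delta {y_{t - 1}} - 0.34\Delta {y_{t - 2}}.$$

Moving the two lagged terms to the right side changes their signs to $+0.21$ and $+0.34$, which are the values reported by estimate.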

Difference the Data Before Estimating

Take the first difference of the data. Estimate an AR(2) model using the differenced data.

dY = diff(y);
MdlAR = arima(2,0,0);
EstMdlAR = estimate(MdlAR,dY);
 
    ARIMA(2,0,0) Model:
    --------------------
    Conditional Probability Distribution: Gaussian

                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
     Constant      0.0104289    0.00380427        2.74137
        AR{1}       0.201194      0.101463        1.98293
        AR{2}        0.32299      0.118035         2.7364
     Variance    9.42421e-05   1.16259e-05        8.10622

The parameter point estimates are very similar to those in EstMdl. The standard errors, however, are larger when the data are differenced before estimation (for example, 0.118 versus 0.104 for AR{2}).
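To put a number on the gap, the ratios of the standard errors can be computed from the two displays. This is only arithmetic on the printed values; the variable names are hypothetical.

```matlab
% Standard errors copied from the two estimation displays
% (rows: Constant, AR{1}, AR{2}).
seARIMA = [0.00328015; 0.0954278; 0.103781];  % ARIMA(2,1,0) fit to y
seAR    = [0.00380427; 0.101463;  0.118035];  % AR(2) fit to diff(y)

ratio = seAR ./ seARIMA   % each entry exceeds 1
```

The ratios come out at about 1.16, 1.06, and 1.14, so the standard errors from the differenced fit are roughly 6% to 16% larger.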

Forecasts made using the estimated AR model (EstMdlAR) will be on the differenced scale. Forecasts made using the estimated ARIMA model (EstMdl) will be on the same scale as the original data.
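The difference in scale can be sketched as follows. Forecasts of the differences are turned back into levels by adding their cumulative sum to the last observed level. This is a sketch under assumptions: the 8-quarter horizon and the names numPeriods, yF, dYF, and yFromDiff are illustrative, and the 'Y0' name-value form of forecast is assumed (depending on the release, presample data may instead be passed as a third positional argument).

```matlab
load Data_JAustralian
y  = DataTable.PAU;
dY = diff(y);

% Refit both models quietly (same specifications as above).
EstMdl   = estimate(arima(2,1,0),y,'Display','off');
EstMdlAR = estimate(arima(2,0,0),dY,'Display','off');

numPeriods = 8;  % illustrative horizon: two years of quarters

% Forecast of the levels directly from the ARIMA model.
yF = forecast(EstMdl,numPeriods,'Y0',y);

% Forecast of the differences from the AR model, then integrate:
% each level forecast is the last observed level plus the
% cumulative sum of the forecasted changes.
dYF = forecast(EstMdlAR,numPeriods,'Y0',dY);
yFromDiff = y(end) + cumsum(dYF);

disp([yF yFromDiff])   % side-by-side level forecasts
```

Because the two sets of parameter estimates differ slightly, the two columns are not expected to match exactly.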

References:

Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.
