Autocorrelation is the linear dependence of a variable with itself at two points in time. For stationary processes, autocorrelation between any two observations only depends on the time lag h between them. Define Cov(y_{t}, y_{t–h}) = γ_{h}. Lag-h autocorrelation is given by
$${\rho}_{h}=Corr({y}_{t},{y}_{t-h})=\frac{{\gamma}_{h}}{{\gamma}_{0}}.$$
The denominator γ_{0} is the lag 0 covariance, i.e., the unconditional variance of the process.
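As a brief worked illustration (the AR coefficient a and the innovation variance σ^{2} are symbols assumed here for the example and are not used elsewhere on this page), consider a stationary AR(1) process $${y}_{t}=a{y}_{t-1}+{\varepsilon}_{t}$$ with |a| < 1 and white-noise innovations of variance σ^{2}. Its autocovariances and autocorrelations are
$${\gamma}_{0}=\frac{{\sigma}^{2}}{1-{a}^{2}},\qquad {\gamma}_{h}={a}^{h}{\gamma}_{0},\qquad {\rho}_{h}=\frac{{\gamma}_{h}}{{\gamma}_{0}}={a}^{h}.$$
For instance, with a = 0.6 the first three autocorrelations are 0.6, 0.36, and 0.216: they depend only on the lag h and decay geometrically.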
Correlation between two variables can result from a mutual linear dependence on other variables (confounding). Partial autocorrelation is the autocorrelation between y_{t} and y_{t–h} after removing any linear dependence on the intermediate observations y_{t–1}, y_{t–2}, ..., y_{t–h+1}. The partial lag-h autocorrelation is denoted $${\varphi}_{h,h}.$$
The autocorrelation function (ACF) for a time series y_{t}, t = 1,...,N, is the sequence $${\rho}_{h},$$ h = 1, 2,...,N – 1. The partial autocorrelation function (PACF) is the sequence $${\varphi}_{h,h},$$ h = 1, 2,...,N – 1.
The theoretical ACF and PACF for the AR, MA, and ARMA conditional mean models are known, and quite different for each model. The differences in ACF and PACF among models are useful when selecting models. The following summarizes the ACF and PACF behavior for these models.
Conditional Mean Model | ACF | PACF |
---|---|---|
AR(p) | Tails off gradually | Cuts off after p lags |
MA(q) | Cuts off after q lags | Tails off gradually |
ARMA(p,q) | Tails off gradually | Tails off gradually |
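The cut-off and tail-off patterns in the table can be checked numerically for the simplest cases. The following is a minimal Python sketch, not part of any toolbox: it assumes an AR(1) model with coefficient a (theoretical ACF a^{h}) and an MA(1) model written as y_{t} = ε_{t} + θε_{t–1} (theoretical ACF θ/(1 + θ^{2}) at lag 1 and zero afterwards), and it converts each ACF to a PACF with the Durbin-Levinson recursion. The function names `ar1_acf`, `ma1_acf`, and `pacf_from_acf` are hypothetical.

```python
def ar1_acf(a, max_lag):
    """Theoretical ACF of a stationary AR(1) with coefficient a: rho_h = a**h."""
    return [a ** h for h in range(1, max_lag + 1)]


def ma1_acf(theta, max_lag):
    """Theoretical ACF of an MA(1), y_t = e_t + theta * e_{t-1}."""
    rho1 = theta / (1.0 + theta ** 2)
    return [rho1] + [0.0] * (max_lag - 1)


def pacf_from_acf(rho):
    """Durbin-Levinson recursion.

    rho[h - 1] is the lag-h autocorrelation for h = 1..H.
    Returns the partial autocorrelations [phi_11, phi_22, ..., phi_HH].
    """
    pacf = []
    prev = []  # prev[j] holds phi_{h-1, j+1}
    for h in range(1, len(rho) + 1):
        if h == 1:
            phi_hh = rho[0]
            cur = [phi_hh]
        else:
            num = rho[h - 1] - sum(prev[j] * rho[h - 2 - j] for j in range(h - 1))
            den = 1.0 - sum(prev[j] * rho[j] for j in range(h - 1))
            phi_hh = num / den
            # Update the lower-order coefficients, then append phi_hh.
            cur = [prev[j] - phi_hh * prev[h - 2 - j] for j in range(h - 1)]
            cur.append(phi_hh)
        pacf.append(phi_hh)
        prev = cur
    return pacf


# AR(1): ACF tails off, PACF is zero (up to rounding) after lag 1.
ar_acf = ar1_acf(0.6, 5)
ar_pacf = pacf_from_acf(ar_acf)

# MA(1): ACF is zero after lag 1, PACF tails off.
ma_acf = ma1_acf(0.5, 5)
ma_pacf = pacf_from_acf(ma_acf)

for h in range(5):
    print(h + 1, round(ar_acf[h], 4), round(ar_pacf[h], 4),
          round(ma_acf[h], 4), round(ma_pacf[h], 4))
```

Printing the four columns side by side shows the mirror-image behavior of AR and MA models that the table summarizes.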
Sample autocorrelation and sample partial autocorrelation are statistics that estimate the theoretical autocorrelation and partial autocorrelation. As a qualitative model selection tool, you can compare the sample ACF and PACF of your data against known theoretical autocorrelation functions [1].
For an observed series y_{1}, y_{2},...,y_{N}, denote the sample mean $$\overline{y}.$$ The sample lag-h autocorrelation is given by
$${\widehat{\rho}}_{h}=\frac{{\displaystyle {\sum}_{t=h+1}^{N}({y}_{t}-\overline{y})({y}_{t-h}-\overline{y})}}{{\displaystyle {\sum}_{t=1}^{N}{({y}_{t}-\overline{y})}^{2}}}.$$
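The sample autocorrelation formula above translates almost directly into code. The following is a minimal Python sketch using only the standard library; the function name `sample_acf` is hypothetical, and the sketch is meant to illustrate the formula rather than to reproduce any toolbox function.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 1..max_lag.

    Numerator: sum of products of deviations that are h steps apart.
    Denominator: total sum of squared deviations (the same for every lag).
    """
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    denom = sum(d * d for d in dev)
    acf = []
    for h in range(1, max_lag + 1):
        # Zero-based t = h..n-1 corresponds to t = h+1..n in the formula.
        num = sum(dev[t] * dev[t - h] for t in range(h, n))
        acf.append(num / denom)
    return acf


# Small illustrative series (made-up values).
series = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_acf(series, 2))
```

Because the denominator always uses all observations while the numerator has fewer terms at longer lags, the estimates shrink toward zero as h grows.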
The standard error for testing the significance of a single lag-h autocorrelation, $${\widehat{\rho}}_{h}$$, is approximately
$$S{E}_{\rho}=\sqrt{(1+2{\displaystyle {\sum}_{i=1}^{h-1}{\widehat{\rho}}_{i}^{2}})/N}.$$
When you use autocorr to plot the sample autocorrelation function (also known as the correlogram), approximate 95% confidence intervals are drawn at $$\pm 2S{E}_{\rho}$$ by default. Optional input arguments let you modify the calculation of the confidence bounds.
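To make the standard error formula concrete, the following Python sketch evaluates it as written above for each lag, given a list of sample autocorrelations and the number of observations. The function names `acf_standard_errors` and `significant_lags` are hypothetical, and the sketch follows the formula in the text; it is not a reproduction of how autocorr computes its bounds.

```python
import math


def acf_standard_errors(rho_hat, n):
    """Approximate standard error of the lag-h sample autocorrelation, h = 1..H.

    rho_hat[h - 1] is the lag-h sample autocorrelation; n is the series length.
    The lag-h value uses the squared autocorrelations at lags 1..h-1.
    """
    errors = []
    cum_sq = 0.0  # running sum of rho_hat_i ** 2 for i < h
    for rho in rho_hat:
        errors.append(math.sqrt((1.0 + 2.0 * cum_sq) / n))
        cum_sq += rho ** 2
    return errors


def significant_lags(rho_hat, n, num_se=2.0):
    """Lags whose sample autocorrelation lies outside +/- num_se * SE."""
    errors = acf_standard_errors(rho_hat, n)
    return [h for h, (rho, se) in enumerate(zip(rho_hat, errors), start=1)
            if abs(rho) > num_se * se]


# Illustrative (made-up) sample autocorrelations from a series of length 100.
rho_hat = [0.45, 0.15, -0.05]
print(acf_standard_errors(rho_hat, 100))
print(significant_lags(rho_hat, 100))
```

At lag 1 the sum is empty, so the standard error reduces to 1/√N; the bounds widen at higher lags as earlier autocorrelations are accumulated.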
The sample lag-h partial autocorrelation is the estimated lag-h coefficient in an AR model containing h lags, $${\widehat{\varphi}}_{h,h}.$$ The standard error for testing the significance of a single lag-h partial autocorrelation is approximately $$1/\sqrt{N-1}.$$ When you use parcorr to plot the sample partial autocorrelation function, approximate 95% confidence intervals are drawn at $$\pm 2/\sqrt{N-1}$$ by default. Optional input arguments let you modify the calculation of the confidence bounds.
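The description of the sample partial autocorrelation as the last coefficient of a fitted AR model with h lags can be sketched as follows. This Python example fits each AR(h) model, with an intercept, by ordinary least squares through the normal equations; that estimation choice, and the names `solve_linear`, `sample_pacf_ols`, and `pacf_bound`, are assumptions made for illustration, and the results need not match what parcorr returns.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def sample_pacf_ols(y, max_lag):
    """Lag-h estimate = coefficient on y[t-h] in a least-squares AR(h) fit."""
    n = len(y)
    pacf = []
    for h in range(1, max_lag + 1):
        # Regressors: intercept, y[t-1], ..., y[t-h]; response: y[t].
        rows = [[1.0] + [y[t - k] for k in range(1, h + 1)] for t in range(h, n)]
        targets = [y[t] for t in range(h, n)]
        p = h + 1
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
        xty = [sum(r[i] * v for r, v in zip(rows, targets)) for i in range(p)]
        coef = solve_linear(xtx, xty)
        pacf.append(coef[-1])
    return pacf


def pacf_bound(n):
    """Approximate 95% bound 2 / sqrt(n - 1) for a series of length n."""
    return 2.0 / math.sqrt(n - 1)


# Illustrative (made-up) series.
series = [1.2, 0.7, 1.9, 1.1, 0.4, 1.6, 2.3, 1.0, 0.8, 1.5, 1.7, 0.9]
estimates = sample_pacf_ols(series, 3)
bound = pacf_bound(len(series))
print([(h, round(v, 4), abs(v) > bound) for h, v in enumerate(estimates, start=1)])
```

Each lag requires its own regression, so the cost grows with the maximum lag; recursive schemes based on the sample autocorrelations avoid the repeated fits but give slightly different estimates.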
[1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.