A univariate time series yt is integrated if it can be brought to stationarity through differencing. The number of differences required to achieve stationarity is called the order of integration. A series integrated of order d is denoted I(d); a stationary series is denoted I(0).
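The relation between differencing and the order of integration can be illustrated numerically. The following is a minimal sketch, assuming NumPy and simulated Gaussian innovations (the seed, sample size, and variable names are illustrative choices): cumulating white noise once gives an I(1) series, cumulating twice gives an I(2) series, and differencing the same number of times recovers the stationary innovations.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = rng.standard_normal(2000)   # stationary I(0) innovations (simulated)

y1 = np.cumsum(eps)               # random walk: I(1)
y2 = np.cumsum(y1)                # cumulated random walk: I(2)

# Differencing d times undoes d cumulative sums, apart from lost initial values.
d1 = np.diff(y1, n=1)
d2 = np.diff(y2, n=2)

recovered_once = np.allclose(d1, eps[1:])    # one difference suffices for y1
recovered_twice = np.allclose(d2, eps[2:])   # two differences are needed for y2
```

In applied work the order of integration is not known in advance and is inferred from unit root and stationarity tests rather than constructed as it is here.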
An n-dimensional time series yt is cointegrated if some linear combination β1y1t + … + βnynt of the component variables is stationary. The combination is called a cointegrating relation, and the coefficients β = (β1 , … , βn)′ form a cointegrating vector. Cointegration is usually associated with systems of I(1) variables, since any I(0) variables are trivially cointegrated with other variables using a vector with coefficient 1 on the I(0) component and coefficient 0 on the other components. The idea of cointegration can be generalized to systems of higher-order variables if a linear combination reduces their common order of integration.
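As a rough illustration of a cointegrating relation, the sketch below builds two I(1) series from one shared random walk and forms a linear combination that cancels it. Everything here is a simulated assumption (the loadings 1 and 2, the noise terms, and the vector β = (2, −1)′), not data or code from the toolbox.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 5000
w = np.cumsum(rng.standard_normal(T))    # shared stochastic trend, I(1)

# Both components inherit the trend, so each is I(1) in levels.
y1 = w + rng.standard_normal(T)
y2 = 2.0 * w + rng.standard_normal(T)

beta = np.array([2.0, -1.0])             # hypothetical cointegrating vector
spread = beta[0] * y1 + beta[1] * y2     # the trend w cancels, leaving only noise

var_levels = y1.var()                    # grows with the sample length
var_spread = spread.var()                # stays near 4 + 1 = 5
```

The spread has a bounded variance because it is a combination of the two stationary noise terms only, while the variance of either level series keeps growing with the sample length.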
Cointegration is distinguished from traditional economic equilibrium, in which a balance of forces produces stable long-term levels in the variables. Cointegrated variables are generally unstable in their levels, but exhibit mean-reverting "spreads" (generalized by the cointegrating relation) that force the variables to move around common stochastic trends. Cointegration is also distinguished from the short-term synchronies of positive covariance, which only measures the tendency to move together at each time step. Modification of the VAR model to include cointegrated variables balances the short-term dynamics of the system with long-term tendencies.
The tendency of cointegrated variables to revert to common stochastic trends is expressed in terms of error-correction. If yt is an n-dimensional time series and β is a cointegrating vector, then the combination β′yt−1 measures the "error" in the data (the deviation from the stationary mean) at time t−1. The rate at which series "correct" from disequilibrium is represented by a vector α of adjustment speeds, which are incorporated into the VAR model at time t through a multiplicative error-correction term αβ′yt−1.
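The error-correction mechanism can be sketched with a small simulation. The example assumes a two-variable system with no lagged differences, a hypothetical cointegrating vector β = (1, −1)′, and hypothetical adjustment speeds α = (−0.2, 0.1)′. Under these assumptions the error β′yt follows an AR(1) process with coefficient 1 + β′α = 0.7, so it reverts to its mean while the levels wander.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000
alpha = np.array([-0.2, 0.1])    # hypothetical adjustment speeds
beta = np.array([1.0, -1.0])     # hypothetical cointegrating vector

y = np.zeros((T, 2))
for t in range(1, T):
    error = beta @ y[t - 1]                  # disequilibrium at time t-1
    # Each component moves by its adjustment speed times the error, plus noise.
    y[t] = y[t - 1] + alpha * error + rng.standard_normal(2)

z = y @ beta                                 # the error series beta'y_t
ar_coef = 1.0 + beta @ alpha                 # implied AR(1) coefficient of z
lag1 = np.corrcoef(z[:-1], z[1:])[0, 1]      # sample lag-1 autocorrelation of z
```

The opposite signs in α pull the first variable down and the second up whenever the first exceeds the second, which is what closes the gap.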
In general, there may be multiple cointegrating relations among the variables in yt, in which case the vectors α and β become matrices A and B, with each column of B representing a specific relation. The error-correction term becomes AB′yt−1 = Cyt−1. Adding the error-correction term to a VAR model in differences produces the vector error-correction (VEC) model:
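Writing the short-run coefficient matrices as Γi and the innovations as εt (symbols assumed here for illustration), a VEC(q) model consistent with the description above can be written as:

```latex
\[
\Delta y_t \;=\; C\,y_{t-1} \;+\; \sum_{i=1}^{q} \Gamma_i\,\Delta y_{t-i} \;+\; \varepsilon_t ,
\qquad C = AB' .
\]
```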
If the variables in yt are all I(1), the terms involving differences are stationary, leaving only the error-correction term to introduce long-term stochastic trends. The rank of the impact matrix C determines the long-term dynamics. If C has full rank n, the system yt is stationary in levels. If C has rank 0, the error-correction term disappears, and the system is stationary in differences. These two extremes correspond to standard choices in univariate modeling. In the multivariate case, however, there are intermediate choices, corresponding to reduced ranks strictly between 0 and n. If C is restricted to reduced rank r, then C factors into (nonunique) n-by-r matrices A and B with C = AB′, and there are r independent cointegrating relations among the variables in yt.
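The reduced-rank factorization and its nonuniqueness can be checked directly. This is a minimal sketch with hypothetical 3-by-2 matrices A and B (so n = 3 and r = 2) and an arbitrary invertible 2-by-2 matrix M: replacing A with AM and B with B(M⁻¹)′ leaves the impact matrix C unchanged.

```python
import numpy as np

# Hypothetical n-by-r matrices with n = 3 variables and r = 2 relations.
A = np.array([[-0.3, 0.0], [0.1, -0.2], [0.0, 0.1]])   # adjustment speeds
B = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])   # cointegrating vectors

C = A @ B.T                          # n-by-n impact matrix
rank_C = np.linalg.matrix_rank(C)    # equals r because A and B have full column rank

# Any invertible r-by-r matrix M yields another valid factorization of C.
M = np.array([[1.0, 0.5], [0.0, 2.0]])
A2 = A @ M
B2 = B @ np.linalg.inv(M).T
same_C = np.allclose(A2 @ B2.T, C)
```

Because only the product AB′ is pinned down, estimated cointegrating vectors are usually reported under some normalization.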
By collecting differences, a VEC(q) model can be converted to a VAR(p) model in levels, with p = q+1:
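Under an assumed notation in which the VEC(q) model is Δyt = Cyt−1 + Γ1Δyt−1 + … + ΓqΔyt−q + εt with q ≥ 1, substituting Δyt−i = yt−i − yt−i−1 and collecting terms by lag gives the levels form below. The symbols Γi, Φi, and εt are labels chosen for this sketch.

```latex
\[
y_t \;=\; \sum_{i=1}^{p} \Phi_i\, y_{t-i} \;+\; \varepsilon_t , \qquad p = q + 1 ,
\]
\[
\Phi_1 = I + C + \Gamma_1 , \qquad
\Phi_i = \Gamma_i - \Gamma_{i-1} \quad (2 \le i \le q) , \qquad
\Phi_{q+1} = -\,\Gamma_q .
\]
```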
Because of the equivalence of the two representations, a VEC model with a reduced-rank error-correction coefficient is often called a cointegrated VAR model. In particular, cointegrated VAR models can be simulated and forecast using standard VAR techniques.
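The equivalence can be verified numerically. The sketch below assumes the VEC(q) model is written with an impact matrix C and short-run matrices Γ1, …, Γq on the lagged differences, and maps them to levels coefficients Φ1 = I + C + Γ1, Φi = Γi − Γi−1, and Φq+1 = −Γq. The function name `vec_to_var` and all numeric values are hypothetical, and the function is not part of any toolbox.

```python
import numpy as np

def vec_to_var(C, gammas):
    """Map VEC(q) coefficients (C, [Gamma_1..Gamma_q]) to VAR(q+1) level coefficients."""
    n = C.shape[0]
    if not gammas:                       # q = 0: only the error-correction term
        return [np.eye(n) + C]
    phis = [np.eye(n) + C + gammas[0]]   # Phi_1
    for i in range(1, len(gammas)):
        phis.append(gammas[i] - gammas[i - 1])   # Phi_2 .. Phi_q
    phis.append(-gammas[-1])             # Phi_{q+1}
    return phis

rng = np.random.default_rng(3)
n, q = 2, 2
C = np.array([[-0.2, 0.2], [0.1, -0.1]])           # rank-1 impact matrix
gammas = [0.1 * rng.standard_normal((n, n)) for _ in range(q)]
phis = vec_to_var(C, gammas)

# Past levels, most recent first: y[k] holds y_{t-k-1}.
y = rng.standard_normal((q + 1, n))

# One noise-free step computed in each representation.
dy = C @ y[0] + sum(G @ (y[i] - y[i + 1]) for i, G in enumerate(gammas))
y_vec = y[0] + dy
y_var = sum(P @ y[i] for i, P in enumerate(phis))
```

Both representations produce the same next value, which is why routines for simulating or forecasting a VAR model in levels can be reused once the coefficients are converted.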
The cointegrated VAR model is often augmented with exogenous terms Dx:
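With the short-run coefficient matrices written as Γi and the innovations as εt (notation assumed for this sketch), the augmented model can be written as:

```latex
\[
\Delta y_t \;=\; AB'\,y_{t-1} \;+\; \sum_{i=1}^{q} \Gamma_i\,\Delta y_{t-i} \;+\; Dx \;+\; \varepsilon_t .
\]
```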
Variables in x may include seasonal or interventional dummies, or deterministic terms representing trends in the data. Since the model is expressed in differences ∆yt, constant terms in x represent linear trends in the levels of yt and linear terms represent quadratic trends. In contrast, constant and linear terms in the cointegrating relations have the usual interpretation as intercepts and linear trends, although restricted to the stationary variable formed by the cointegrating relation. Johansen considers five cases for AB′yt−1 + Dx, which cover the majority of observed behaviors in macroeconomic systems:
| Case | Form of AB′yt−1 + Dx | Model Interpretation |
|---|---|---|
| H2 | AB′yt−1 | There are no intercepts or trends in the cointegrating relations and there are no trends in the data. This model is only appropriate if all series have zero mean. |
| H1* | A(B′yt−1 + c0) | There are intercepts in the cointegrating relations and there are no trends in the data. This model is appropriate for nontrending data with nonzero mean. |
| H1 | A(B′yt−1 + c0) + c1 | There are intercepts in the cointegrating relations and there are linear trends in the data. This is a model of deterministic cointegration, where the cointegrating relations eliminate both stochastic and deterministic trends in the data. |
| H* | A(B′yt−1 + c0 + d0t) + c1 | There are intercepts and linear trends in the cointegrating relations and there are linear trends in the data. This is a model of stochastic cointegration, where the cointegrating relations eliminate stochastic but not deterministic trends in the data. |
| H | A(B′yt−1 + c0 + d0t) + c1 + d1t | There are intercepts and linear trends in the cointegrating relations and there are quadratic trends in the data. Unless quadratic trends are actually present in the data, this model may produce good in-sample fits but poor out-of-sample forecasts. |
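The link between deterministic terms in the differences and trends in the levels can be checked with a few lines of arithmetic. The sketch below is illustrative only: it ignores the stochastic and error-correction terms and uses hypothetical scalar values for c1 and d1. Cumulating a constant gives a linear trend, and cumulating a linear term gives a quadratic trend.

```python
import numpy as np

t = np.arange(1, 201)
c1, d1 = 0.5, 0.01                            # hypothetical constant and slope

# Levels are the cumulative sums of the differences.
levels_const = np.cumsum(np.full(t.size, c1))   # from the difference c1
levels_linear = np.cumsum(c1 + d1 * t)          # from the difference c1 + d1*t

# A linear trend has zero second differences; a quadratic trend has constant ones.
second_diff_const = np.diff(levels_const, n=2)
second_diff_linear = np.diff(levels_linear, n=2)
```

This is why the H case, with c1 + d1t outside the cointegrating relations, implies quadratic trends in the data.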
In Econometrics Toolbox™, deterministic terms outside of the cointegrating relations, c1 and d1, are identified by projecting constant and linear regression coefficients, respectively, onto the orthogonal complement of A.
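One way to read this decomposition is as an orthogonal split of an unrestricted constant c into a part inside the column space of A and a remainder. The sketch below is an assumption about the general idea, not the toolbox's actual computation. A and c are hypothetical values, and least squares is used here to obtain the coordinates c0.

```python
import numpy as np

# Hypothetical n-by-r adjustment matrix (n = 3, r = 2) and unrestricted constant.
A = np.array([[-0.3, 0.0], [0.1, -0.2], [0.0, 0.1]])
c = np.array([0.4, -0.1, 0.3])

# Coordinates of the part of c that lies in the column space of A.
c0, *_ = np.linalg.lstsq(A, c, rcond=None)

# The remainder is orthogonal to every column of A.
c1 = c - A @ c0

orthogonal = np.allclose(A.T @ c1, 0.0)
reassembled = np.allclose(A @ c0 + c1, c)
```

Under this reading, A c0 plays the role of the intercept inside the cointegrating relations, and c1 is the part of the constant that drives a linear trend in the levels.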
Integration and cointegration both present opportunities for transforming variables to stationarity. Integrated variables, identified by unit root and stationarity tests, can be differenced to stationarity. Cointegrated variables, identified by cointegration tests, can be combined to form new, stationary variables. In practice, it must be determined if such transformations lead to more reliable models, with variables that retain an economic interpretation.
Generalizing from the univariate case can be misleading. In the standard Box-Jenkins approach to univariate ARMA modeling, stationarity is an essential assumption. Without it, the underlying distribution theory and estimation techniques become invalid. In the corresponding multivariate case, where the VAR model is unrestricted and there is no cointegration, choices are less straightforward. If the goal of a VAR analysis is to determine relationships among the original variables, differencing loses information. In this context, Sims, Stock, and Watson advise against differencing, even in the presence of unit roots. If, however, the goal is to simulate an underlying data-generating process, integrated levels data can cause a number of problems. Model specification tests lose power due to an increase in the number of estimated parameters. Other tests, such as those for Granger causality, no longer have standard distributions, and become invalid. Finally, forecasts over long time horizons suffer from inconsistent estimates, due to impulse responses that do not decay. Enders discusses modeling strategies.
In the presence of cointegration, simple differencing is a model misspecification, since long-term information appears in the levels. Fortunately, the cointegrated VAR model provides intermediate options, between differences and levels, by mixing them together with the cointegrating relations. Since all terms of the cointegrated VAR model are stationary, problems with unit roots are eliminated.
Cointegration modeling is often suggested, independently, by economic theory. Examples of variables that are commonly described with a cointegrated VAR model include:
Money stock, interest rates, income, and prices (common models of money demand)
Investment, income, and consumption (common models of productivity)
Consumption and long-term income expectation (Permanent Income Hypothesis)
Exchange rates and prices in foreign and domestic markets (Purchasing Power Parity)
Spot and forward currency exchange rates and interest rates (Covered Interest Rate Parity)
Interest rates of different maturities (Term Structure Expectations Hypothesis)
Interest rates and inflation (Fisher Equation)
Since these theories describe long-term equilibria among the variables, accurate estimation of cointegrated models may require large amounts of low-frequency (annual, quarterly, monthly) macroeconomic data. As a result, these models must consider the possibility of structural changes in the underlying data-generating process during the sample period.
Financial data, by contrast, is often available at high frequencies (hours, minutes, microseconds). The mean-reverting spreads of cointegrated financial series can be modeled and examined for arbitrage opportunities. For example, the Law of One Price suggests cointegration among the following groups of variables:
Prices of assets with identical cash flows
Prices of assets and dividends
Spot, future, and forward prices
Bid and ask prices