Stochastic Process Characteristics

What Is a Stochastic Process?

A time series y_t is a collection of observations on a variable indexed sequentially over several time points t = 1, 2,...,T. Time series observations y₁, y₂,...,y_T are inherently dependent. From a statistical modeling perspective, this means it is inappropriate to treat a time series as a random sample of independent observations.

The goal of statistical modeling is to find a compact representation of the data-generating process for your data. The statistical building block of econometric time series modeling is the stochastic process. Heuristically, a stochastic process is a joint probability distribution for a collection of random variables. By modeling the observed time series y_t as a realization from a stochastic process $y = \{y_{t}; t = 1, \dots, T\}$, it is possible to accommodate the high-dimensional and dependent nature of the data. The set of observation times can be discrete or continuous. Figure 1-1, Monthly Average CO2, displays the monthly average CO₂ concentration (ppm) recorded by the Mauna Loa Observatory in Hawaii from 1980 to 2012 [3].

Figure 1-1, Monthly Average CO2

Time series plot showing Monthly Average CO2 Concentration at Mauna Loa with the years 1980 through 2012 on the x axis and CO2 Concentration in parts per million on the y axis.

Stationary Processes

Stochastic processes are weakly stationary or covariance stationary (or simply, stationary) if their first two moments are finite and constant over time. Specifically, if y_t is a stationary stochastic process, then for all t:

$E (y_{t}) = μ < \infty.$
$V (y_{t}) = σ^{2} < \infty.$
$Cov (y_{t}, y_{t - h}) = γ_{h}$ for all lags $h \neq 0.$
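The three conditions can be checked informally on simulated data. The following is a minimal sketch, not a formal test: it assumes a Gaussian AR(1) process with coefficient 0.5 as the example (this model, the sample size, and the helper names `simulate_ar1` and `sample_autocovariance` are illustrative choices, not part of the original text). It compares the sample mean, variance, and lag-1 autocovariance across the two halves of one long realization; for a stationary process these should roughly agree.

```python
import numpy as np


def simulate_ar1(phi, n, sigma=1.0, seed=0):
    """Simulate y_t = phi * y_{t-1} + eps_t with Gaussian innovations (illustrative)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n)
    y = np.empty(n)
    y[0] = eps[0]
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]
    return y


def sample_autocovariance(y, h):
    """Sample autocovariance of y at lag h >= 0 (divides by the sample size)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    d = y - y.mean()
    return float(np.dot(d[h:], d[: n - h]) / n)


y = simulate_ar1(phi=0.5, n=200_000)
first, second = y[:100_000], y[100_000:]

for name, part in (("first half", first), ("second half", second)):
    print(
        name,
        "mean:", round(part.mean(), 3),
        "variance:", round(sample_autocovariance(part, 0), 3),
        "lag-1 autocovariance:", round(sample_autocovariance(part, 1), 3),
    )
```

Agreement between the two halves is consistent with stationarity but does not prove it; a trending or random-walk series would typically show clearly different values in each half.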

Does a plot of your stochastic process seem to increase or decrease without bound? If so, the stochastic process might be nonstationary. In Figure 1-1, Monthly Average CO2, the concentration of CO₂ trends upward over the whole sample, which suggests a nonstationary stochastic process.

Linear Time Series Model

Wold’s theorem [2] states that you can write all weakly stationary stochastic processes in the general linear form

$y_{t} = μ + \sum_{i = 1}^{\infty} ψ_{i} ε_{t - i} + ε_{t} .$

Here, $ε_{t}$ denotes a sequence of uncorrelated (but not necessarily independent) random variables from a well-defined probability distribution with mean zero. It is often called the innovation process because it captures all new information in the system at time t.
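To make the general linear form concrete, the sketch below assumes a zero-mean AR(1) process $y_{t} = 0.5 y_{t - 1} + ε_{t}$ as the example, for which the coefficients are $ψ_{i} = 0.5^{i}$. It builds the series twice, once by recursion and once as a truncated weighted sum of current and past innovations, and checks that the two agree. The coefficient 0.5, the truncation point `K`, and the Gaussian innovations are assumptions made for illustration.

```python
import numpy as np

phi = 0.5          # assumed AR(1) coefficient
n, K = 500, 60     # series length and truncation point of the infinite sum
rng = np.random.default_rng(1)
eps = rng.normal(size=n)   # innovations (Gaussian here for convenience)

# Recursive form: y_t = phi * y_{t-1} + eps_t, started at y_0 = eps_0.
y_rec = np.empty(n)
y_rec[0] = eps[0]
for t in range(1, n):
    y_rec[t] = phi * y_rec[t - 1] + eps[t]

# General linear form with mu = 0: y_t = eps_t + sum_{i>=1} psi_i * eps_{t-i},
# truncated at K lags. psi[0] = 1 is the coefficient on eps_t itself.
psi = phi ** np.arange(K + 1)
y_ma = np.convolve(eps, psi)[:n]

print("largest discrepancy:", np.max(np.abs(y_rec - y_ma)))
```

Because $0.5^{i}$ decays quickly, truncating the sum at 60 lags leaves a discrepancy at the level of floating-point rounding.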

Unit Root Process

A linear time series model is a unit root process if the solution set to its characteristic equation contains a root that is on the unit circle (that is, a root with an absolute value of one). Consequently, the expected value, variance, or covariance of the elements of the stochastic process grows with time, and the process is therefore nonstationary. If your series has a unit root, then differencing it might make it stationary.

For example, consider the linear time series model $y_{t} = y_{t - 1} + ε_{t},$ where $ε_{t}$ is a white noise sequence of innovations with variance σ². This model is known as a random walk. The characteristic equation of this model is $z - 1 = 0,$ which has a root of one. If the initial observation y₀ is fixed, then you can write the model as $y_{t} = y_{0} + \sum_{i = 1}^{t} ε_{i} .$ Its expected value is y₀, which is independent of time. However, the variance of the series is tσ², which grows with time, so the series is nonstationary. Take the first difference to transform the series, and the model becomes $d_{t} = y_{t} - y_{t - 1} = ε_{t}$ . The characteristic equation for this series is $z = 0$ , so it does not have a unit root. Note that

$E (d_{t}) = 0,$ which is independent of time,
$V (d_{t}) = σ^{2},$ which is independent of time, and
$Cov (d_{t}, d_{t - s}) = 0,$ which is independent of time for all integers 0 < s < t.
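The variance claims for the random walk and its first difference can be illustrated by simulation. This is a minimal sketch under stated assumptions: Gaussian innovations with σ = 1, y₀ = 0, and 20,000 independent paths of length 100 (all illustrative choices). It estimates the variance across paths at each time point, which should be close to tσ² for the random walk and close to σ² for the differenced series.

```python
import numpy as np

rng = np.random.default_rng(42)
sigma = 1.0
n_paths, T = 20_000, 100

# Each row is one realization of the innovations eps_1, ..., eps_T.
eps = rng.normal(0.0, sigma, size=(n_paths, T))

# Random walk with y_0 = 0: column t-1 holds y_t = eps_1 + ... + eps_t.
y = np.cumsum(eps, axis=1)

# Variance across paths at each time point; theory gives t * sigma^2.
var_by_t = y.var(axis=0)

# First difference d_t = y_t - y_{t-1}; theory gives variance sigma^2 at every t.
d = np.diff(y, axis=1)
var_d_by_t = d.var(axis=0)

for t in (10, 50, 100):
    print(f"t={t:3d}  Var(y_t) is about {var_by_t[t - 1]:7.2f}  (theory {t * sigma**2:.0f})")
print("Var(d_t) ranges over", var_d_by_t.min().round(3), "to", var_d_by_t.max().round(3))
```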

Figure 1-1, Monthly Average CO2, appears nonstationary. What happens if you plot the first difference d_t = y_t – y_t–1 of this series? Figure 1-2, Monthly Difference in CO2, displays d_t. Ignoring the fluctuations, the differenced series does not seem to increase or decrease in general. This suggests that d_t is stationary and that y_t is unit root nonstationary. For details, see Differencing.

Figure 1-2, Monthly Difference in CO2

Time series plot showing Monthly Difference of Average CO2 Concentration at Mauna Loa with the years 1980 through 2012 on the x axis and Difference in CO2 Concentration in parts per million on the y axis.

Lag Operator Notation

The lag operator L operates on a time series y_t such that $L^{i} y_{t} = y_{t - i}$ .

An mth-degree lag polynomial of coefficients b₁, b₂,...,b_m is defined as

$B (L) = (1 + b_{1} L + b_{2} L^{2} + \dots + b_{m} L^{m}) .$
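As an illustration of how a lag polynomial acts on data, the sketch below applies $B (L)$ to a finite series by forming $y_{t} + b_{1} y_{t - 1} + \dots + b_{m} y_{t - m}$ for every t that has m earlier observations available. The function name `apply_lag_polynomial` is hypothetical. With $b_{1} = -1$ the polynomial is $1 - L$, which reproduces first differencing.

```python
import numpy as np


def apply_lag_polynomial(b, y):
    """Apply B(L) = 1 + b_1 L + ... + b_m L^m to the series y.

    Returns B(L) y_t for t = m, ..., n-1 (zero-based), that is, only for
    time points where all m lagged values exist. Assumes len(y) > m.
    """
    y = np.asarray(y, dtype=float)
    coeffs = np.concatenate(([1.0], np.asarray(b, dtype=float)))
    # In "valid" mode, np.convolve returns sum_i coeffs[i] * y[t - i]
    # for exactly those t where every term is defined.
    return np.convolve(y, coeffs, mode="valid")


y = np.array([1.0, 4.0, 9.0, 16.0, 25.0])

first_diff = apply_lag_polynomial([-1.0], y)        # (1 - L) y_t
second_diff = apply_lag_polynomial([-2.0, 1.0], y)  # (1 - L)^2 y_t = (1 - 2L + L^2) y_t

print(first_diff)   # [3. 5. 7. 9.]
print(second_diff)  # [2. 2. 2.]
```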

In lag operator notation, you can write the general linear model using an infinite-degree polynomial $ψ (L) = (1 + ψ_{1} L + ψ_{2} L^{2} + \dots),$

$y_{t} = μ + ψ (L) ε_{t} .$

You cannot estimate a model that has an infinite-degree polynomial of coefficients with a finite amount of data. However, if $ψ (L)$ is a rational polynomial (or approximately rational), you can write it (at least approximately) as the quotient of two finite-degree polynomials.

Define the q-degree polynomial $θ (L) = (1 + θ_{1} L + θ_{2} L^{2} + \dots + θ_{q} L^{q})$ and the p-degree polynomial $ϕ (L) = (1 + ϕ_{1} L + ϕ_{2} L^{2} + \dots + ϕ_{p} L^{p})$ . If $ψ (L)$ is rational, then

$ψ (L) = \frac{θ (L)}{ϕ (L)} .$

Thus, by Wold’s theorem, you can model (or closely approximate) every stationary stochastic process as

$y_{t} = μ + \frac{θ (L)}{ϕ (L)} ε_{t},$

which has p + q coefficients (a finite number).
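The link between the finite-degree polynomials and the infinite-degree polynomial can be made explicit. Matching coefficients of $L^{j}$ in $ϕ (L) ψ (L) = θ (L)$, with the sign convention used above (both polynomials written with plus signs and a leading 1), gives $ψ_{0} = 1$ and $ψ_{j} = θ_{j} - \sum_{i = 1}^{\min(j, p)} ϕ_{i} ψ_{j - i}$, where $θ_{j} = 0$ for j > q. The sketch below implements this recursion; the function name `psi_weights` and the example coefficients are illustrative assumptions.

```python
import numpy as np


def psi_weights(theta, phi, n_terms):
    """First n_terms coefficients psi_0, psi_1, ... of psi(L) = theta(L) / phi(L).

    theta = [theta_1, ..., theta_q] and phi = [phi_1, ..., phi_p], where
    theta(L) = 1 + theta_1 L + ... and phi(L) = 1 + phi_1 L + ...
    """
    theta = np.concatenate(([1.0], np.asarray(theta, dtype=float)))
    phi = np.concatenate(([1.0], np.asarray(phi, dtype=float)))
    psi = np.zeros(n_terms)
    for j in range(n_terms):
        acc = theta[j] if j < theta.size else 0.0
        for i in range(1, min(j, phi.size - 1) + 1):
            acc -= phi[i] * psi[j - i]
        psi[j] = acc
    return psi


# phi(L) = 1 - 0.5 L and theta(L) = 1: psi_j = 0.5 ** j.
print(psi_weights([], [-0.5], 5))     # [1.     0.5    0.25   0.125  0.0625]

# phi(L) = 1 - 0.5 L and theta(L) = 1 + 0.4 L.
print(psi_weights([0.4], [-0.5], 4))  # [1.    0.9   0.45  0.225]
```

Two coefficients are enough to generate the whole infinite sequence in the second example, which is the sense in which the rational form is a compact representation.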

Characteristic Equation

A degree p characteristic polynomial of the linear time series model $y_{t} = ϕ_{1} y_{t - 1} + ϕ_{2} y_{t - 2} + ... + ϕ_{p} y_{t - p} + ε_{t}$ is

$ϕ (a) = a^{p} - ϕ_{1} a^{p - 1} - ϕ_{2} a^{p - 2} - ... - ϕ_{p} .$

The characteristic polynomial provides another way to assess whether a series is a stationary process. For example, the characteristic polynomial of $y_{t} = 0.5 y_{t - 1} - 0.02 y_{t - 2} + ε_{t}$ is $ϕ (a) = a^{2} - 0.5 a + 0.02.$

The roots of the homogeneous characteristic equation $ϕ (a) = 0$ (called the characteristic roots) determine whether the linear time series is stationary. If every root of $ϕ (a) = 0$ lies inside the unit circle, then the process is stationary. Roots lie inside the unit circle if they have an absolute value less than one. The process is a unit root process if one or more roots lie on the unit circle (that is, have an absolute value of one). Continuing the example, the characteristic roots of $ϕ (a) = 0$ are $a = \{0.4562, 0.0438\} .$ Since the absolute values of these roots are less than one, the linear time series model is stationary.
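The root check can be reproduced numerically. The sketch below uses `numpy.roots` to solve $a^{p} - ϕ_{1} a^{p - 1} - \dots - ϕ_{p} = 0$ and classifies the model by the moduli of the roots. The helper names, the numerical tolerance, and the label used when a root lies outside the unit circle are assumptions made for illustration.

```python
import numpy as np


def characteristic_roots(phi):
    """Roots of a^p - phi_1 a^(p-1) - ... - phi_p = 0 for
    y_t = phi_1 y_{t-1} + ... + phi_p y_{t-p} + eps_t."""
    coeffs = np.concatenate(([1.0], -np.asarray(phi, dtype=float)))
    return np.roots(coeffs)


def classify(phi, tol=1e-8):
    """Classify the model by where its characteristic roots lie."""
    moduli = np.abs(characteristic_roots(phi))
    if np.all(moduli < 1.0 - tol):
        return "stationary"
    if np.any(np.abs(moduli - 1.0) <= tol):
        return "unit root"
    return "nonstationary (root outside the unit circle)"


# Example from the text: y_t = 0.5 y_{t-1} - 0.02 y_{t-2} + eps_t.
roots = characteristic_roots([0.5, -0.02])
print(np.sort(np.abs(roots)).round(4))  # [0.0438 0.4562]
print(classify([0.5, -0.02]))           # stationary

# Random walk: y_t = y_{t-1} + eps_t has the single root a = 1.
print(classify([1.0]))                  # unit root
```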

References

[1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Wold, H. A Study in the Analysis of Stationary Time Series. Uppsala, Sweden: Almqvist & Wiksell, 1938.

[3] Tans, P., and R. Keeling. (2012, August). “Trends in Atmospheric Carbon Dioxide.” NOAA Research. Retrieved October 5, 2012 from https://gml.noaa.gov/ccgg/trends/mlo.html.