MATLAB Examples

Test Simulated Data for a Unit Root

This example shows how to test univariate time series models for stationarity. It shows how to simulate data from four types of models: trend stationary, difference stationary, stationary (AR(1)), and a heteroscedastic random walk. It also shows that the tests yield the expected results.

Simulate four time series.

T = 1e3;       % Sample size
t = (1:T)';    % Time multiple

rng(142857);   % For reproducibility

y1 = randn(T,1) + .2*t; % Trend stationary

Mdl2 = arima('D',1,'Constant',0.2,'Variance',1);
y2 = simulate(Mdl2,T,'Y0',0); % Difference stationary

Mdl3 = arima('AR',0.99,'Constant',0.2,'Variance',1);
y3 = simulate(Mdl3,T,'Y0',0); % AR(1)

Mdl4 = arima('D',1,'Constant',0.2,'Variance',1);
sigma = (sin(t/200) + 1.5)/2; % Std deviation
e = randn(T,1).*sigma;        % Innovations
y4 = filter(Mdl4,e,'Y0',0);   % Heteroscedastic
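
The four data-generating processes can also be written out as explicit recursions, which may make the differences between them easier to see. The following is a minimal sketch in base MATLAB, without arima objects and assuming zero presample values. The names e2, e3, and e4 are introduced here for illustration only, and because the random draws are consumed in a different order than in simulate, the resulting paths are not expected to match the simulated series number for number.

```matlab
T = 1e3;                      % Sample size
t = (1:T)';                   % Time index
rng(142857);                  % For reproducibility

% Innovations (illustrative names, not part of the original example)
e2 = randn(T,1);
e3 = randn(T,1);
e4 = randn(T,1).*(sin(t/200) + 1.5)/2;   % Time-varying std deviation

y1 = 0.2*t + randn(T,1);              % Trend stationary: noise around a line
y2 = cumsum(0.2 + e2);                % y(t) = y(t-1) + 0.2 + e2(t)
y3 = filter(1,[1 -0.99],0.2 + e3);    % y(t) = 0.2 + 0.99*y(t-1) + e3(t)
y4 = cumsum(0.2 + e4);                % Random walk with drift, heteroscedastic steps
```

In y1 the shocks never accumulate, in y2 and y4 every shock is carried forward permanently, and in y3 each shock decays slowly at rate 0.99 per period.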

Plot the first 100 points in each series.

y = [y1 y2 y3 y4];
figure;
plot1 = plot(y(1:100,:));
plot1(1).LineWidth = 2;
plot1(3).LineStyle = ':';
plot1(3).LineWidth = 2;
plot1(4).LineStyle = ':';
plot1(4).LineWidth = 2;
title '{\bf First 100 Periods of Each Series}';
legend('Trend Stationary','Difference Stationary','AR(1)',...
   'Heteroscedastic','location','northwest');

All of the models appear nonstationary and behave similarly. Therefore, you might find it difficult to distinguish which series comes from which model simply by looking at their initial segments.

Plot the entire data set.

figure;
plot2 = plot(y);
plot2(1).LineWidth = 2;
plot2(3).LineStyle = ':';
plot2(3).LineWidth = 2;
plot2(4).LineStyle = ':';
plot2(4).LineWidth = 2;
title '{\bf Each Entire Series}';
legend('Trend Stationary','Difference Stationary','AR(1)',...
   'Heteroscedastic','location','northwest');

The differences between the series are clearer here:

  • The trend stationary series has little deviation from its mean trend.
  • The difference stationary and heteroscedastic series have persistent deviations away from the trend line.
  • The AR(1) series exhibits long-run stationary behavior; the others grow linearly.
  • The difference stationary and heteroscedastic series appear similar. However, the heteroscedastic series has much more local variability near period 300, and much less near period 900. The model variance is maximal when $\sin(t/200) = 1$, at time $100\pi \approx 314$. The model variance is minimal when $\sin(t/200) = -1$, at time $300\pi \approx 942$. Therefore, the visual variability matches the model.
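
The periods of maximal and minimal variance quoted in the last bullet can be checked numerically. This short sketch recomputes the standard deviation profile used for y4 and locates its extremes on the integer time grid; the names sigmaMax, tMax, sigmaMin, and tMin are introduced here for illustration.

```matlab
t = (1:1e3)';                    % Same time index as in the simulation
sigma = (sin(t/200) + 1.5)/2;    % Innovation standard deviation of y4

[sigmaMax,tMax] = max(sigma);    % Largest std deviation and its period
[sigmaMin,tMin] = min(sigma);    % Smallest std deviation and its period

fprintf('max sigma = %.2f at t = %d\n',sigmaMax,tMax);   % 1.25 at t = 314
fprintf('min sigma = %.2f at t = %d\n',sigmaMin,tMin);   % 0.25 at t = 942
```

The standard deviation near period 314 is about five times its value near period 942, which is consistent with the visible change in local variability.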

Use the Augmented Dickey-Fuller test on the three growing series (y1, y2, and y4) to assess whether the series have a unit root. Since the series are growing, specify that there is a trend. In this case, the null hypothesis is $H_0: y_t = y_{t-1} + c + b_1\Delta y_{t-1} + b_2\Delta y_{t-2} + \varepsilon_t$ and the alternative hypothesis is $H_1: y_t = ay_{t-1} + c + \delta t + b_1\Delta y_{t-1} + b_2\Delta y_{t-2} + \varepsilon_t$. Set the number of lags to 2 for demonstration purposes.

hY1 = adftest(y1, 'model','ts', 'lags',2)
hY2 = adftest(y2, 'model','ts', 'lags',2)
hY4 = adftest(y4, 'model','ts', 'lags',2)
hY1 =

  logical

   1


hY2 =

  logical

   0


hY4 =

  logical

   0

  • hY1 = 1 indicates that there is sufficient evidence to suggest that y1 is trend stationary. This is the correct decision because y1 is trend stationary by construction.
  • hY2 = 0 indicates that there is not enough evidence to suggest that y2 is trend stationary. This is the correct decision since y2 is difference stationary by construction.
  • hY4 = 0 indicates that there is not enough evidence to suggest that y4 is trend stationary. This is the correct decision. However, the Dickey-Fuller test is not appropriate for a heteroscedastic series.
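
The logical decision alone does not show how strong the evidence is. As a sketch, the additional outputs of adftest (p-value, test statistic, and critical value) can be requested for y1. The block regenerates y1 with the same seed so that it runs on its own; the fprintf formatting is only one possible way to display the values.

```matlab
T = 1e3;                     % Sample size
t = (1:T)';                  % Time index
rng(142857);                 % Same seed as in the simulation step
y1 = randn(T,1) + .2*t;      % Trend stationary

% Request the p-value, test statistic, and critical value as well
[h,pValue,stat,cValue] = adftest(y1,'model','ts','lags',2);
fprintf('h = %d, p = %.4f, stat = %.2f, critical value = %.2f\n', ...
    h,pValue,stat,cValue);
```

The test is left-tailed, so it rejects the unit-root null when the statistic falls below the critical value.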

Use the Augmented Dickey-Fuller test on the AR(1) series (y3) to assess whether the series has a unit root. Since the series is not growing, specify that the series is autoregressive with a drift term. In this case, the null hypothesis is $H_0: y_t = y_{t-1} + b_1\Delta y_{t-1} + b_2\Delta y_{t-2} + \varepsilon_t$ and the alternative hypothesis is $H_1: y_t = ay_{t-1} + c + b_1\Delta y_{t-1} + b_2\Delta y_{t-2} + \varepsilon_t$. Set the number of lags to 2 for demonstration purposes.

hY3 = adftest(y3, 'model','ard', 'lags',2)
hY3 =

  logical

   1

hY3 = 1 indicates that there is enough evidence to suggest that y3 is a stationary, autoregressive process with a drift term. This is the correct decision because y3 is an autoregressive process with a drift term by construction.

Use the KPSS test to assess whether the series are unit root nonstationary. Specify that there is a trend in the growing series (y1, y2, and y4). The KPSS test assumes the following model:

$$y_t = c_t + \delta t + u_t$$

$$c_t = c_{t-1} + \varepsilon_t,$$

where $u_t$ is a stationary process and $\varepsilon_t$ is an independent and identically distributed process with mean 0 and variance $\sigma^2$. Whether or not there is a trend in the model, the null hypothesis is $H_0: \sigma^2 = 0$ (the series is trend stationary) and the alternative hypothesis is $H_1: \sigma^2 > 0$ (not trend stationary). Set the number of lags to 2 for demonstration purposes.

hY1 = kpsstest(y1, 'lags',2, 'trend',true)
hY2 = kpsstest(y2, 'lags',2, 'trend',true)
hY3 = kpsstest(y3, 'lags',2)
hY4 = kpsstest(y4, 'lags',2, 'trend',true)
hY1 =

  logical

   0


hY2 =

  logical

   1


hY3 =

  logical

   1


hY4 =

  logical

   1

All tests result in the correct decision.
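
Because the lag count of 2 is chosen only for demonstration, it can be useful to see how sensitive the KPSS decision is to that choice. The sketch below passes a vector of lags to kpsstest, which runs one test per lag, and tabulates the results for y1. The range 0 to 4 and the table layout are illustrative assumptions, not recommendations.

```matlab
T = 1e3;                     % Sample size
t = (1:T)';                  % Time index
rng(142857);                 % Same seed as in the simulation step
y1 = randn(T,1) + .2*t;      % Trend stationary

lags = (0:4)';               % Illustrative range of lags
[h,pValue,stat,cValue] = kpsstest(y1,'lags',lags,'trend',true);

% One row per test; (:) forces column orientation for the table
disp(table(lags,h(:),pValue(:),stat(:), ...
    'VariableNames',{'Lags','h','pValue','stat'}));
```

The KPSS test is right-tailed, so a decision of 1 corresponds to a statistic above the critical value in cValue.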

Use the variance ratio test on all four series to assess whether the series are random walks. The null hypothesis is $H_0$: $Var(\Delta y_t)$ is constant, and the alternative hypothesis is $H_1$: $Var(\Delta y_t)$ is not constant. Specify that the innovations are independent and identically distributed for all but y1. Test y4 both ways.

hY1 = vratiotest(y1)
hY2 = vratiotest(y2,'IID',true)
hY3 = vratiotest(y3,'IID',true)
hY4NotIID = vratiotest(y4)
hY4IID = vratiotest(y4, 'IID',true)
hY1 =

  logical

   1


hY2 =

  logical

   0


hY3 =

  logical

   0


hY4NotIID =

  logical

   0


hY4IID =

  logical

   0

All tests result in the correct decisions, except for hY4IID = 0. This test does not reject the hypothesis that the heteroscedastic process is an IID random walk. This inconsistency might be associated with the random seed.
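
The idea behind the variance ratio test is that, for a random walk, the variance of a q-period difference grows linearly in q, so the ratio $Var(y_t - y_{t-q}) / (q\,Var(\Delta y_t))$ is close to 1. The following base MATLAB sketch computes that raw ratio by hand. It is not the estimator that vratiotest uses internally, the helper vr is a hypothetical name, and yRW is a freshly simulated stand-in for y2 rather than the series tested above.

```matlab
T = 1e3;                         % Sample size
t = (1:T)';                      % Time index
rng(142857);                     % For reproducibility
y1 = randn(T,1) + .2*t;          % Trend stationary
yRW = cumsum(0.2 + randn(T,1));  % Random walk with drift (stand-in for y2)

q = 2;   % Two-period horizon
% Hypothetical helper: variance of q-period differences divided by
% q times the variance of one-period differences
vr = @(y) var(y(1+q:end) - y(1:end-q)) / (q*var(diff(y)));

fprintf('VR(%d), trend stationary: %.2f\n',q,vr(y1));    % Near 0.5 in theory
fprintf('VR(%d), random walk:      %.2f\n',q,vr(yRW));   % Near 1 in theory
```

For y1, both the one-period and two-period differences of the noise have variance 2, so the ratio is near 2/(2*2) = 0.5. This is well below 1, in line with the rejection reported for hY1.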

Alternatively, you can assess stationarity using pptest.
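
As a sketch of that alternative, the Phillips-Perron test can be applied to a heteroscedastic random walk with the same trend specification used for adftest. In pptest, the lags setting controls the number of autocovariance lags in the long-run variance estimate rather than the number of lagged differences. The series here is rebuilt with cumsum so that the block runs on its own, which is an assumption for illustration; it is not the y4 produced by filter, so the decision may differ.

```matlab
T = 1e3;                           % Sample size
t = (1:T)';                        % Time index
rng(142857);                       % For reproducibility
sigma = (sin(t/200) + 1.5)/2;      % Time-varying std deviation
e = randn(T,1).*sigma;             % Heteroscedastic innovations
y4 = cumsum(0.2 + e);              % Stand-in for y4: random walk with drift

% Phillips-Perron test with a trend-stationary alternative
[h,pValue,stat,cValue] = pptest(y4,'model','ts','lags',2);
fprintf('h = %d, p = %.4f, stat = %.2f, critical value = %.2f\n', ...
    h,pValue,stat,cValue);
```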