Time Series Forecasting after taking first differences

I am trying to forecast a series using a regression model with one independent variable. If I estimate the regression with the variables in level form, the independent variable has the correct sign and is statistically significant at the 1% level. If both variables are non-stationary and I take first differences of both series to make them stationary, then estimate the model again, the independent variable is no longer statistically significant. Does this mean that this independent variable is a bad predictor to base a forecast on? Does the level-form regression only appear to show a strong relationship because of the problems with non-stationary data? And what if only the dependent variable is non-stationary, and I estimate a regression with the first difference of the dependent variable and the level form of the independent variable?

Answers (1)

Hang Qian on 14 May 2016
Hi shackelferd,
A time series regression on non-stationary data that are not cointegrated may be a “spurious regression”: it is likely to show a high R^2 and statistically significant parameter estimates even when the series are unrelated.
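To make the spurious regression point concrete, here is a minimal simulation sketch in Python (standard library only). It regresses one random walk on another, independent random walk, first in levels and then in first differences, and counts how often the slope looks significant at the nominal 5% level. The function names, the sample size of 100, and the 500 replications are illustrative assumptions, not anything from the original question.

```python
import math
import random


def ols_slope_t(x, y):
    """Return (slope, t-statistic) for the OLS fit y = a + b*x + e."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # conventional OLS standard error
    return b, b / se


def random_walk(n, rng):
    """Cumulative sum of n standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(series):
    """First difference of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def rejection_rates(n_obs=100, n_sims=500, seed=0):
    """Share of simulations with |t| > 1.96, in levels and in differences."""
    rng = random.Random(seed)
    hits_level = hits_diff = 0
    for _ in range(n_sims):
        x = random_walk(n_obs, rng)  # x and y are independent by construction
        y = random_walk(n_obs, rng)
        _, t_level = ols_slope_t(x, y)
        _, t_diff = ols_slope_t(diff(x), diff(y))
        hits_level += abs(t_level) > 1.96
        hits_diff += abs(t_diff) > 1.96
    return hits_level / n_sims, hits_diff / n_sims


level_rate, diff_rate = rejection_rates()
print(f"levels:      slope 'significant' in {level_rate:.0%} of simulations")
print(f"differences: slope 'significant' in {diff_rate:.0%} of simulations")
```

Because the two series are unrelated, a well-behaved test should reject in about 5% of simulations. The level regressions reject far more often than that, while the differenced regressions stay close to the nominal rate.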
The loss of significance after differencing does not necessarily mean that the independent variable is a bad predictor, but you do need to check the residuals of the level regression. If the residuals are well behaved and look stationary, the two series are likely cointegrated. In that case it is legitimate to trust the level regression, and the independent variable is a good predictor. If you prefer to first-difference the data, consider adding an error correction term (the lagged residual from the level regression) to the differenced regression.
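The residual check and the error correction term can be sketched as a two-step procedure in the spirit of Engle and Granger. The Python example below (standard library only) uses simulated data that are cointegrated by construction; the helper `ols`, the variable names, and the data-generating values (intercept 2, slope 0.5, AR(1) coefficient 0.5) are all assumptions made for illustration. The Dickey-Fuller style t-statistic on the residuals does not follow the usual t distribution, so compare it with cointegration critical values from published tables (roughly -3.3 to -3.4 at the 5% level for two variables with a constant) rather than -1.96. In MATLAB, the Econometrics Toolbox functions `egcitest` and `adftest` are the natural tools for this step.

```python
import math
import random


def ols(columns, y):
    """OLS coefficients of y on the given regressor columns (normal equations)."""
    k = len(columns)
    # Augmented matrix [X'X | X'y]
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Simulated, cointegrated data (illustrative assumption):
# x is a random walk, y = 2 + 0.5*x + stationary AR(1) noise.
rng = random.Random(1)
n = 300
x, u = [], []
level = noise = 0.0
for _ in range(n):
    level += rng.gauss(0.0, 1.0)
    noise = 0.5 * noise + rng.gauss(0.0, 1.0)
    x.append(level)
    u.append(noise)
y = [2.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]

# Step 1: level regression and its residuals.
a_hat, b_hat = ols([[1.0] * n, x], y)
e = [yi - a_hat - b_hat * xi for xi, yi in zip(x, y)]

# Dickey-Fuller style regression on the residuals: de_t = rho * e_{t-1} + error.
e_lag = e[:-1]
de = [e[t] - e[t - 1] for t in range(1, n)]
see = sum(v * v for v in e_lag)
rho = sum(d * v for d, v in zip(de, e_lag)) / see
rss = sum((d - rho * v) ** 2 for d, v in zip(de, e_lag))
df_t = rho / math.sqrt(rss / (len(de) - 1) / see)

# Step 2: error correction model, dy_t = c + beta*dx_t + gamma*e_{t-1} + error.
dx = [x[t] - x[t - 1] for t in range(1, n)]
dy = [y[t] - y[t - 1] for t in range(1, n)]
c_hat, beta_hat, gamma_hat = ols([[1.0] * (n - 1), dx, e_lag], dy)

print(f"level slope            = {b_hat:.3f}")
print(f"residual DF t-stat     = {df_t:.2f}")
print(f"ECM short-run slope    = {beta_hat:.3f}")
print(f"ECM adjustment (gamma) = {gamma_hat:.3f}")
```

A strongly negative residual t-statistic together with a negative adjustment coefficient `gamma_hat` is what cointegration looks like in this setup: deviations from the level relationship are corrected in later periods.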
Many tests for stationarity or non-stationarity offer guidance on whether a variable should be differenced before running a regression. Keep in mind that these tests rely on asymptotic results and many have low power, whereas every dataset is finite. With a finite sample, it is not possible to distinguish a unit root from a sufficiently persistent stationary series. So in the end it is your own judgment that settles the question, and you are the one who decides whether to use level or first-differenced data.
If the ultimate goal is prediction, consider out-of-sample forecast evaluation and cross-validation. All is well that forecasts well.
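One way to run such an out-of-sample comparison is a rolling-origin (expanding window) evaluation, sketched below in Python with the standard library only. At each forecast origin the models are refitted on the data available so far, and the next observation is forecast by the level regression, by the differenced regression, and by a naive no-change benchmark. The function names are hypothetical, the data are two simulated independent random walks standing in for your own series, and the sketch assumes that the next value of the independent variable is known when the forecast is made.

```python
import math
import random


def fit_line(x, y):
    """OLS intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def rolling_rmse(x, y, first_origin):
    """One-step-ahead RMSE from an expanding window, for three forecasters."""
    squared = {"levels": 0.0, "differences": 0.0, "naive": 0.0}
    count = 0
    for t in range(first_origin, len(y) - 1):
        xs, ys = x[:t + 1], y[:t + 1]  # data available at the forecast origin
        a, b = fit_line(xs, ys)
        dx = [q - p for p, q in zip(xs, xs[1:])]
        dy = [q - p for p, q in zip(ys, ys[1:])]
        c, d = fit_line(dx, dy)
        forecasts = {
            "levels": a + b * x[t + 1],
            "differences": y[t] + c + d * (x[t + 1] - x[t]),
            "naive": y[t],  # no-change benchmark
        }
        for name, value in forecasts.items():
            squared[name] += (y[t + 1] - value) ** 2
        count += 1
    return {name: math.sqrt(total / count) for name, total in squared.items()}


# Simulated stand-in data (assumption): two independent random walks.
rng = random.Random(42)
x, y = [], []
level_x = level_y = 0.0
for _ in range(200):
    level_x += rng.gauss(0.0, 1.0)
    level_y += rng.gauss(0.0, 1.0)
    x.append(level_x)
    y.append(level_y)

rmse = rolling_rmse(x, y, first_origin=100)
for name, value in rmse.items():
    print(f"{name:12s} RMSE = {value:.3f}")
```

With your own data, replace `x` and `y` and pick `first_origin` so that the first estimation window is long enough to fit both models. A level regression that cannot beat the naive benchmark out of sample is a warning sign that its in-sample significance was spurious.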
Regards,
- Hang Qian
