Presample data comes from time points before the beginning of the observation period. In Econometrics Toolbox™, you can specify your own presample data or use generated presample data.
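In a conditional mean model, the distribution of $\varepsilon_t$ is conditional on historical information. Historical information includes past responses, $y_1, y_2, \ldots, y_{t-1}$, past innovations, $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{t-1}$, and, if you include them in the model, past and present exogenous covariates, $x_1, x_2, \ldots, x_{t-1}, x_t$.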
The number of past responses and innovations that a current innovation depends on is determined by the degree of the AR or MA operators, and any differencing. For example, in an AR(2) model, each innovation depends on the two previous responses:
$$\varepsilon_t = y_t - c - \phi_1 y_{t-1} - \phi_2 y_{t-2}.$$
In ARIMAX models, the current innovation also depends on the current value of the exogenous covariate (unlike distributed lag models). For example, in an ARX(2) model with one exogenous covariate, each innovation depends on the previous two responses and the current value of the covariate:
$$\varepsilon_t = y_t - c - \phi_1 y_{t-1} - \phi_2 y_{t-2} - \beta x_t.$$
In general, the likelihood contribution of the first few innovations is conditional on historical information that might not be observable. How do you estimate the parameters without all the data? In the ARX(2) example, $\varepsilon_2$ explicitly depends on $y_1$, $y_0$, and $x_2$, and $\varepsilon_1$ explicitly depends on $y_0$, $y_{-1}$, and $x_1$. Implicitly, $\varepsilon_2$ depends on $x_1$ and $x_0$, and $\varepsilon_1$ depends on $x_0$ and $x_{-1}$. However, you cannot observe $y_0$, $y_{-1}$, $x_0$, and $x_{-1}$.
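The explicit dependence described above can be illustrated with a minimal Python sketch. It is not toolbox code: the function name arx2_innovations and its arguments are hypothetical, and the parameter values are made up for the example. It assumes the ARX(2) form in which each innovation equals the current response minus a constant, the two lagged AR terms, and a regression term in the current covariate value. It covers only the explicit dependence on responses, which is why exactly two presample responses are required.

```python
def arx2_innovations(y, x, c, phi1, phi2, beta, presample_y):
    """Infer ARX(2) innovations (illustrative helper, not a toolbox function).

    presample_y holds the two responses that precede y[0], oldest first.
    """
    if len(presample_y) != 2:
        raise ValueError("an ARX(2) model needs exactly two presample responses")
    if len(x) != len(y):
        raise ValueError("x must supply one covariate value per response")

    # Prepend the presample so the first two innovations have lags to use.
    history = list(presample_y) + list(y)
    innovations = []
    for t in range(len(y)):
        # history[t + 2] is y[t]; history[t + 1] and history[t] are its lags.
        eps = (history[t + 2] - c
               - phi1 * history[t + 1]
               - phi2 * history[t]
               - beta * x[t])
        innovations.append(eps)
    return innovations


# Hypothetical values chosen only to make the arithmetic easy to follow.
eps = arx2_innovations(
    y=[1.0, 2.0, 3.0], x=[0.5, 1.0, 1.5],
    c=0.5, phi1=0.5, phi2=0.25, beta=2.0,
    presample_y=[4.0, 2.0],
)
print(eps)
```

Without presample_y, the first two loop iterations would have no lagged responses to draw on, which is the problem that presample data solves.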
The amount of presample data that you need to initialize a model depends on the degree of the model. The property P of an arima model specifies the number of presample responses and exogenous data that you need to initialize the AR portion of a conditional mean model. For example, P = 2 in an ARX(2) model. Therefore, you need two responses and two data points from each exogenous covariate series to initialize the model.
One option is to use the first P observations of the response and exogenous covariate series as your presample, and then fit your model to the remaining data. This results in some loss of sample size. If you plan to compare multiple potential models, be aware that likelihood-based measures of fit (including the likelihood ratio test and information criteria) are comparable only across models fit to the same data (of the same sample size). If you specify your own presample data, then you must use the largest required number of presample responses across all models that you want to compare.
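As a rough illustration of holding out a common presample, the following Python sketch reserves the largest required number of leading observations so that every candidate model is fit to the same estimation sample. The helper names split_for_comparison and presample_for are hypothetical and are not part of the toolbox.

```python
def split_for_comparison(y, degrees):
    """Hold out the largest required presample across candidate models.

    degrees lists the number of presample responses each candidate needs.
    Returns (presample, estimation_sample).
    """
    p_max = max(degrees)
    if p_max >= len(y):
        raise ValueError("not enough data left to fit after removing presample")
    return list(y[:p_max]), list(y[p_max:])


def presample_for(presample, p):
    """Return the most recent p presample values for one candidate model."""
    if p > len(presample):
        raise ValueError("presample is shorter than the requested degree")
    return presample[len(presample) - p:]


y = [float(v) for v in range(10)]        # made-up response series
presample, sample = split_for_comparison(y, degrees=[1, 2, 3])
print(presample)                         # shared pool of presample responses
print(presample_for(presample, 2))       # what a degree-2 candidate would use
print(len(sample))                       # identical for every candidate
```

Each candidate takes only the most recent values it needs from the shared presample, so the estimation sample and its size stay fixed across models.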
The property Q of an arima model specifies the number of presample innovations needed to initialize the MA portion of a conditional mean model. You can get presample innovations by dividing your data into two parts. Fit a model to the first part, and infer the innovations. Then, use the inferred innovations as presample innovations for estimating the second part of the data.
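The two-part approach can be sketched in Python for a pure MA model. This is a simplified illustration under stated assumptions: the fitting step is omitted and the constant and MA coefficient are treated as already known (hypothetical values), the function name ma_innovations is invented for the example, and the first part is itself started from zero presample innovations.

```python
def ma_innovations(y, c, thetas, presample_e):
    """Infer MA(q) innovations, q = len(thetas) (illustrative helper).

    Uses eps_t = y_t - c - sum_j thetas[j-1] * eps_{t-j}.
    presample_e holds the q innovations that precede y[0], oldest first.
    """
    q = len(thetas)
    if len(presample_e) != q:
        raise ValueError("need one presample innovation per MA lag")
    e = list(presample_e)
    for y_t in y:
        # e[-(j + 1)] is the innovation j + 1 steps before the current one.
        ma_part = sum(thetas[j] * e[-(j + 1)] for j in range(q))
        e.append(y_t - c - ma_part)
    return e[q:]  # drop the presample entries


# Hypothetical MA(1) parameters and data, assumed known for this sketch.
c, thetas = 0.0, [0.5]
y = [1.0, 2.0, 3.0, 4.0]
first, second = y[:2], y[2:]

# Part 1: infer innovations, starting from zero presample innovations.
e_first = ma_innovations(first, c, thetas, [0.0] * len(thetas))

# Part 2: reuse the last Q inferred innovations as the presample.
e0 = e_first[-len(thetas):]
e_second = ma_innovations(second, c, thetas, e0)
print(e0, e_second)
```

In practice the parameters used in the first part would come from fitting a model to that part of the data, as the text describes.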
For a model with both an autoregressive and moving average component, you can specify both presample responses and innovations, one or the other, or neither.
If you do not specify presample data, estimate generates presample responses and innovations automatically. The software:
Generates presample responses by backward forecasting.
Sets presample innovations to zero.
Does not generate presample exogenous data. One option is to backward forecast each exogenous series to generate a presample during data preprocessing.
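One simple way to backward forecast a series during preprocessing is sketched below in Python. It assumes a stationary AR(1) with known constant c and coefficient phi (hypothetical values here), for which the best linear backward predictor of the value just before the sample is c + phi times the earliest known value. The function name backcast_ar1 is invented for the example, and this is not necessarily the procedure that estimate uses internally.

```python
def backcast_ar1(series, c, phi, n):
    """Backward forecast n values that precede series[0] (illustrative).

    Assumes a stationary AR(1), so the time-reversed series follows the
    same recursion: value_before = c + phi * earliest_known_value.
    Returns the backcasts oldest first.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if abs(phi) >= 1:
        raise ValueError("the AR(1) coefficient must satisfy |phi| < 1")
    backcasts = []
    earliest = series[0]
    for _ in range(n):
        earliest = c + phi * earliest   # step one period further back
        backcasts.append(earliest)
    backcasts.reverse()                 # oldest first, ready to prepend
    return backcasts


x = [4.0, 5.0, 4.5]                     # made-up exogenous series
x_presample = backcast_ar1(x, c=1.0, phi=0.5, n=2)
print(x_presample + x)                  # presample followed by the data
```

The same helper could be applied to each exogenous series in turn to build the presample rows that the software does not generate.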