estimate

Class: regARIMA

Estimate parameters of regression models with ARIMA errors

Syntax

EstMdl = estimate(Mdl,y)
[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y)
[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,Name,Value)

Description

EstMdl = estimate(Mdl,y) uses maximum likelihood to estimate the parameters of the regression model with ARIMA time series errors, Mdl, given the response series y. EstMdl is a regARIMA model that stores the results.

[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y) additionally returns EstParamCov, the variance-covariance matrix associated with the estimated parameters; logL, the optimized loglikelihood objective function value; and info, a data structure of summary information.

[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,Name,Value) estimates the model using additional options specified by one or more Name,Value pair arguments.

Input Arguments


Mdl — Regression model with ARIMA errors
regARIMA model

Regression model with ARIMA errors, specified as a regARIMA model returned by regARIMA or estimate.

estimate treats non-NaN elements in Mdl as equality constraints, and does not estimate the corresponding parameters.

y — Single path of response data
numeric column vector

Single path of response data to which the model is fit, specified as a numeric column vector. The last observation of y is the latest.

Data Types: double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'AR0' — Initial estimates of ARIMA error model nonseasonal autoregressive coefficients
numeric vector

Initial estimates of ARIMA error model nonseasonal autoregressive coefficients, specified as the comma-separated pair consisting of 'AR0' and a numeric vector.

The number of coefficients in AR0 must equal the number of lags associated with nonzero coefficients in the nonseasonal autoregressive polynomial.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

'Beta0' — Initial estimates of regression coefficients
numeric vector

Initial estimates of regression coefficients, specified as the comma-separated pair consisting of 'Beta0' and a numeric vector.

The number of coefficients in Beta0 must equal the number of columns of X.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

'Display' — Command Window display option
'params' (default) | 'diagnostics' | 'full' | 'iter' | 'off' | cell vector of strings

Command Window display option, specified as the comma-separated pair consisting of 'Display' and a string or cell vector of strings.

Set Display using any combination of values in this table.

Value            estimate Displays
-------------    ------------------------------------------------------------
'diagnostics'    Optimization diagnostics
'full'           Maximum likelihood parameter estimates, standard errors, t statistics, iterative optimization information, and optimization diagnostics
'iter'           Iterative optimization information
'off'            Nothing in the Command Window
'params'         Maximum likelihood parameter estimates, standard errors, and t statistics

For example,

  • To run a simulation where you are fitting many models, and therefore want to suppress all output, use 'Display','off'.

  • To display all estimation results and the optimization diagnostics, use 'Display',{'params','diagnostics'}.
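The two Display settings above can be compared side by side. The following is a minimal sketch on simulated data; the data-generating model (TrueMdl) and the variable names are illustrative assumptions, not part of the function's interface.

```matlab
rng(1);                                   % reproducible illustrative data
TrueMdl = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Beta',[0.1 -0.2],'Variance',0.1);
X = randn(100,2);
y = simulate(TrueMdl,100,'X',X);

ToEstMdl = regARIMA(2,0,1);               % all coefficients unknown (NaN)

% Silent fit, for example inside a loop over many candidate models.
EstQuiet = estimate(ToEstMdl,y,'X',X,'Display','off');

% Parameter table plus optimization diagnostics in the Command Window.
EstVerbose = estimate(ToEstMdl,y,'X',X, ...
    'Display',{'params','diagnostics'});
```

Both calls fit the same model to the same data; only the Command Window output differs.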

Data Types: char | cell

'DoF0' — Initial t-distribution degree-of-freedom estimate
10 (default) | positive scalar

Initial t-distribution degree-of-freedom estimate, specified as the comma-separated pair consisting of 'DoF0' and a positive scalar. DoF0 must exceed 2.

Data Types: double

'E0' — Presample innovations
numeric column vector

Presample innovations that have mean 0 and provide initial values for the ARIMA error model, specified as the comma-separated pair consisting of 'E0' and a numeric column vector. E0 must contain at least Mdl.Q rows. If E0 contains extra rows, then estimate uses the latest Mdl.Q presample innovations. The last row contains the latest presample innovation.

By default, estimate sets the necessary presample innovations to 0.

Data Types: double

'Intercept0' — Initial regression model intercept estimate
scalar

Initial regression model intercept estimate, specified as the comma-separated pair consisting of 'Intercept0' and a scalar.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

'MA0' — Initial estimates of ARIMA error model nonseasonal moving average coefficients
numeric vector

Initial estimates of ARIMA error model nonseasonal moving average coefficients, specified as the comma-separated pair consisting of 'MA0' and a numeric vector.

The number of coefficients in MA0 must equal the number of lags associated with nonzero coefficients in the nonseasonal moving average polynomial.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

'Options' — Optimization options
optimoptions optimization controller | optimset optimization controller

Optimization options, specified as the comma-separated pair consisting of 'Options' and an optimoptions or optimset optimization controller. For details on altering the default values of the optimizer, see optimoptions, optimset, or fmincon in Optimization Toolbox™.

For example, suppose that you want to change the constraint tolerance to 1e-6. Set Options = optimoptions(@fmincon,'TolCon',1e-6,'Algorithm','sqp'), and then pass Options into estimate using 'Options',Options.

By default, estimate uses the same default options as fmincon, except Algorithm = sqp and TolCon = 1e-7.
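The tolerance change described above might look as follows in practice. This is a sketch on simulated data; TrueMdl, X, and y are illustrative assumptions, and 'TolCon' is the option name used in the text.

```matlab
rng(1);                                   % reproducible illustrative data
TrueMdl = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Beta',[0.1 -0.2],'Variance',0.1);
X = randn(100,2);
y = simulate(TrueMdl,100,'X',X);

ToEstMdl = regARIMA(2,0,1);

% Build the optimizer options, then hand them to estimate.
Options = optimoptions(@fmincon,'Algorithm','sqp','TolCon',1e-6);
[EstMdl,~,~,info] = estimate(ToEstMdl,y,'X',X, ...
    'Options',Options,'Display','off');

% info.options holds the options the optimizer actually used.
disp(info.options)
```

Because 'Display' is passed to estimate here, it takes precedence over any display settings stored in Options.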

'SAR0' — Initial estimates of ARIMA error model seasonal autoregressive coefficients
numeric vector

Initial estimates of ARIMA error model seasonal autoregressive coefficients, specified as the comma-separated pair consisting of 'SAR0' and a numeric vector.

The number of coefficients in SAR0 must equal the number of lags associated with nonzero coefficients in the seasonal autoregressive polynomial.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

'SMA0' — Initial estimates of ARIMA error model seasonal moving average coefficients
numeric vector

Initial estimates of ARIMA error model seasonal moving average coefficients, specified as the comma-separated pair consisting of 'SMA0' and a numeric vector.

The number of coefficients in SMA0 must equal the number of lags associated with nonzero coefficients in the seasonal moving average polynomial.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

'U0' — Presample unconditional disturbances
numeric column vector

Presample unconditional disturbances that provide initial values for the ARIMA error model, specified as the comma-separated pair consisting of 'U0' and a numeric column vector. U0 must contain at least Mdl.P rows. If U0 contains extra rows, then estimate uses the latest Mdl.P presample unconditional disturbances. The last row contains the latest presample unconditional disturbance.

By default, estimate backcasts for the necessary number of presample unconditional disturbances.

Data Types: double
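As an illustration of the sizing rules for E0 and U0, the sketch below supplies exactly Mdl.Q presample innovations and Mdl.P presample unconditional disturbances. Setting both to zero is an arbitrary choice made only for this example, and the simulated data and variable names are assumptions.

```matlab
rng(1);                                   % reproducible illustrative data
TrueMdl = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Variance',0.1);
y = simulate(TrueMdl,100);

ToEstMdl = regARIMA(2,0,1);               % P = 2, Q = 1

% Column vectors with the minimum required number of rows.
E0 = zeros(ToEstMdl.Q,1);                 % presample innovations
U0 = zeros(ToEstMdl.P,1);                 % presample unconditional disturbances

EstMdl = estimate(ToEstMdl,y,'E0',E0,'U0',U0,'Display','off');
```

Omitting 'U0' instead lets estimate backcast the presample disturbances, which is the default behavior described above.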

'Variance0' — Initial estimate of ARIMA error model innovation variance
positive scalar

Initial estimate of ARIMA error model innovation variance, specified as the comma-separated pair consisting of 'Variance0' and a positive scalar.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

'X' — Predictor data
matrix

Predictor data in the regression model, specified as the comma-separated pair consisting of 'X' and a matrix.

The columns of X are separate, synchronized time series, with the last row containing the latest observations. The number of rows of X must be at least the length of y. If the number of rows of X exceeds the number required, then estimate uses the latest observations.

By default, estimate does not estimate the regression coefficients regardless of their presence in Mdl.

Data Types: double
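The row-alignment rule for X can be checked with a small experiment. In this sketch (simulated data; all variable names are illustrative assumptions), the predictor matrix has 20 more rows than the response, so the two fits are expected to use the same 100 predictor rows.

```matlab
rng(1);                                   % reproducible illustrative data
TrueMdl = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Beta',[0.1 -0.2],'Variance',0.1);
XLong = randn(120,2);                     % 20 more rows than the response
y = simulate(TrueMdl,100,'X',XLong(end-99:end,:));

ToEstMdl = regARIMA(2,0,1);
ToEstMdl.Intercept = 0;                   % hold the intercept fixed at 0

% Pass all 120 rows; per the text, only the latest 100 are used.
EstLong = estimate(ToEstMdl,y,'X',XLong,'Display','off');

% Pass exactly the latest 100 rows.
EstTrim = estimate(ToEstMdl,y,'X',XLong(end-99:end,:),'Display','off');

% The two columns of regression coefficients are expected to agree.
disp([EstLong.Beta(:) EstTrim.Beta(:)])
```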

    Notes  

    • NaNs in y, E0, U0, and X indicate missing values, and estimate removes them. The software merges the presample data (E0 and U0) separately from the effective sample data (X and y), then uses list-wise deletion to remove any NaNs. Removing NaNs in the data reduces the sample size, and can also create irregular time series.

    • estimate assumes that you synchronize the data (presample separately from effective sample) such that the latest observations occur simultaneously.

    • The intercept of a regression model with ARIMA errors having nonzero degrees of seasonal or nonseasonal integration is not identifiable. In other words, estimate cannot estimate an intercept of a regression model with ARIMA errors that has nonzero degrees of seasonal or nonseasonal integration. If you pass in such a model for estimation, estimate displays a warning in the Command Window and sets EstMdl.Intercept to NaN.

    • If you specify a value for Display, then it takes precedence over the specifications of the optimization options Diagnostics and Display. Otherwise, estimate honors all selections related to the display of optimization information in the optimization options.
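The list-wise deletion mentioned in the first note can be pictured with the sketch below. It only mimics the idea on a toy effective sample and is not the code that estimate runs internally; the data values and variable names are made up for illustration.

```matlab
% Toy effective-sample data with missing values.
y = [1.2; NaN; 0.7; 1.1; 0.9];
X = [0.5 1.0; 0.4 0.9; NaN 1.1; 0.6 1.2; 0.7 1.3];

% Merge response and predictors, then drop every row containing a NaN.
keep   = ~any(isnan([y X]),2);
yClean = y(keep);
XClean = X(keep,:);

fprintf('Kept %d of %d observations.\n',sum(keep),numel(y));
```

Rows 2 and 3 are dropped here, so the remaining observations are no longer equally spaced in time. This is the irregular time series that the note warns about.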

Output Arguments


EstMdl — Model containing parameter estimates
regARIMA model

Model containing the parameter estimates, returned as a regARIMA model. estimate uses maximum likelihood to calculate all parameter estimates not constrained by Mdl (that is, all parameters in Mdl that you set to NaN).

EstParamCov — Variance-covariance matrix of maximum likelihood estimates
matrix

Variance-covariance matrix of maximum likelihood estimates of model parameters known to the optimizer, returned as a matrix.

The rows and columns contain the covariances of the parameter estimates. The standard errors of the parameter estimates are the square roots of the entries along the main diagonal. The rows and columns associated with any parameters held fixed as equality constraints contain 0s.

estimate uses the outer product of gradients (OPG) method to perform covariance matrix estimation.
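For orientation, the textbook form of the OPG estimator is sketched below. Here $\ell_t(\theta)$ denotes the loglikelihood contribution of observation $t$ and $g_t$ its gradient; this notation is an assumption introduced for illustration, and the exact scaling and numerical differentiation used by estimate are not stated in this page.

```latex
\widehat{\operatorname{Cov}}(\hat{\theta})
  = \left[ \sum_{t=1}^{T} g_t(\hat{\theta})\, g_t(\hat{\theta})^{\top} \right]^{-1},
\qquad
g_t(\theta) = \frac{\partial \ell_t(\theta)}{\partial \theta}.
```

The estimator needs only first derivatives of the loglikelihood, which is why it is a common alternative to inverting a Hessian.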

estimate orders the parameters in EstParamCov as follows:

  • Intercept

  • Nonzero AR coefficients at positive lags

  • Nonzero SAR coefficients at positive lags

  • Nonzero MA coefficients at positive lags

  • Nonzero SMA coefficients at positive lags

  • Regression coefficients (when you specify X in estimate)

  • Innovations variance

  • Degrees of freedom for the t distribution

Data Types: double
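Standard errors can be recovered from EstParamCov as described above. The following sketch uses simulated data, and the variable names are illustrative assumptions. Entries of se follow the parameter order listed above, and parameters held fixed have a standard error of 0.

```matlab
rng(1);                                   % reproducible illustrative data
TrueMdl = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Beta',[0.1 -0.2],'Variance',0.1);
X = randn(100,2);
y = simulate(TrueMdl,100,'X',X);

ToEstMdl = regARIMA(2,0,1);
ToEstMdl.Intercept = 0;                   % fixed, so its variance is 0

[EstMdl,EstParamCov] = estimate(ToEstMdl,y,'X',X,'Display','off');

% Square roots of the main diagonal give the standard errors.
se = sqrt(diag(EstParamCov));
disp(se)
```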

logL — Optimized loglikelihood objective function value
scalar

Optimized loglikelihood objective function value, returned as a scalar.

Data Types: double

info — Summary information
structure

Summary information, returned as a structure.

Field       Description
--------    ------------------------------------------------------------
exitflag    Optimization exit flag (see fmincon in Optimization Toolbox)
options     Optimization options controller (see optimoptions and fmincon in Optimization Toolbox)
X           Vector of final parameter estimates
X0          Vector of initial parameter estimates

For example, you can display the vector of final estimates by typing info.X in the Command Window.

Data Types: struct
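One way to use info is to check the optimizer's exit flag before trusting the estimates. The sketch below assumes the fmincon convention that a positive exit flag indicates convergence; the simulated data and variable names are illustrative.

```matlab
rng(1);                                   % reproducible illustrative data
TrueMdl = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Beta',[0.1 -0.2],'Variance',0.1);
X = randn(100,2);
y = simulate(TrueMdl,100,'X',X);

ToEstMdl = regARIMA(2,0,1);
[EstMdl,~,logL,info] = estimate(ToEstMdl,y,'X',X,'Display','off');

% Flag fits where the optimizer did not report convergence.
if info.exitflag <= 0
    warning('Optimizer did not report convergence (exitflag = %d).', ...
        info.exitflag);
end

% Initial and final parameter vectors, side by side.
disp([info.X0(:) info.X(:)])
```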

Examples


Estimate Parameters of a Regression Model with ARIMA Errors Without Initial Values

Fit this regression model with ARMA(2,1) errors to simulated data:

$$\begin{aligned}
y_t &= X_t \begin{bmatrix} 0.1 \\ -0.2 \end{bmatrix} + u_t \\
u_t &= 0.5 u_{t-1} - 0.8 u_{t-2} + \varepsilon_t - 0.5 \varepsilon_{t-1},
\end{aligned}$$

where $\varepsilon_{t}$ is Gaussian with variance 0.1.

Specify the regression model with ARMA(2,1) errors. Simulate two predictor series, and then simulate responses from the model.

Mdl = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Beta',[0.1 -0.2],'Variance',0.1);
rng(1);
X =  randn(100,2);
y = simulate(Mdl,100,'X',X);

Specify a regression model with ARMA(2,1) errors with no intercept, and unknown coefficients and variance.

ToEstMdl = regARIMA(2,0,1);
ToEstMdl.Intercept = 0 % Exclude the intercept
ToEstMdl = 

    ARIMA(2,0,1) Error Model:
    --------------------------
    Distribution: Name = 'Gaussian'
       Intercept: 0
               P: 2
               D: 0
               Q: 1
              AR: {NaN NaN} at Lags [1 2]
             SAR: {}
              MA: {NaN} at Lags [1]
             SMA: {}
        Variance: NaN

The AR coefficients, MA coefficients, and the innovation variance are NaN values. estimate estimates those parameters, but not the intercept. The intercept is held fixed at 0.

Fit the regression model with ARMA(2,1) errors to the data.

EstMdl = estimate(ToEstMdl,y,'X',X,'Display','params');
 
    Regression with ARIMA(2,0,1) Error Model:
    ------------------------------------------
    Conditional Probability Distribution: Gaussian

                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
    Intercept              0         Fixed          Fixed
        AR{1}       0.620303      0.104194        5.95338
        AR{2}      -0.697172     0.0795748       -8.76122
        MA{1}      -0.558083      0.131897       -4.23122
        Beta1       0.103667     0.0217347        4.76964
        Beta2      -0.209448     0.0241883       -8.65904
     Variance      0.0748852    0.00903584        8.28758

The result, EstMdl, is a new regARIMA model. The estimates in EstMdl resemble the parameter values that generated the simulated data.

Estimate Parameters of a Regression Model with ARIMA Errors Using Initial Values

Fit a regression model with ARMA(1,1) errors by regressing the differenced log GDP onto the differenced CPI, and supply initial values for the parameters.

Load the US Macroeconomic data set and preprocess the data.

load Data_USEconModel;
logGDP = log(DataTable.GDP);
dlogGDP = diff(logGDP);        % For stationarity
dCPI = diff(DataTable.CPIAUCSL); % For stationarity
T = length(dlogGDP);           % Effective sample size

Specify an "empty" regression model with ARMA(1,1) errors.

ToEstMdl = regARIMA(1,0,1);

Fit the model to the first half of the data.

EstMdl0 = estimate(ToEstMdl,dlogGDP(1:ceil(T/2)),...
    'X',dCPI(1:ceil(T/2)),'Display','off');

The result is a new regARIMA model with the estimated parameters.

Use the estimated parameters as initial values for fitting the second half of the data.

Intercept0 = EstMdl0.Intercept;
AR0        = EstMdl0.AR{1};
MA0        = EstMdl0.MA{1};
Variance0  = EstMdl0.Variance;
Beta0      = EstMdl0.Beta;

[EstMdl,~,~,info] = estimate(ToEstMdl,...
   dlogGDP(floor(T/2)+1:end),'X',...
   dCPI(floor(T/2)+1:end),'Display','params',...
   'Intercept0',Intercept0,'AR0',AR0,'MA0',MA0,...
   'Variance0',Variance0,'Beta0',Beta0);
 
    Regression with ARIMA(1,0,1) Error Model:
    ------------------------------------------
    Conditional Probability Distribution: Gaussian

                                  Standard          t     
     Parameter       Value          Error       Statistic 
    -----------   -----------   ------------   -----------
    Intercept      0.0111738    0.00210199         5.3158
        AR{1}       0.786836     0.0362291        21.7184
        MA{1}      -0.473619     0.0655402       -7.22639
        Beta1     0.00219331   0.000583268        3.76038
     Variance    4.83486e-05    4.1705e-06         11.593

Display all of the parameter estimates using info.X.

info.X
ans =

    0.0112
    0.7868
   -0.4736
    0.0022
    0.0000

The order of the parameter estimates in info.X matches the order that estimate displays in its output table.

Tip

Suppose EstParamCov is an estimated parameter covariance matrix returned by estimate. The software sets the variances and covariances of parameters fixed during estimation to 0. Enter this command to count the number of free parameters (numParams) in a fitted model.

numParams = sum(any(EstParamCov))

This command counts the number of columns (or equivalently, rows) with any nonzero values.
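A common use of this count is to compute information criteria for model comparison. The sketch below passes numParams to aicbic; the simulated data and variable names are illustrative assumptions.

```matlab
rng(1);                                   % reproducible illustrative data
TrueMdl = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Beta',[0.1 -0.2],'Variance',0.1);
X = randn(100,2);
y = simulate(TrueMdl,100,'X',X);

ToEstMdl = regARIMA(2,0,1);
ToEstMdl.Intercept = 0;                   % fixed, so not counted as free

[EstMdl,EstParamCov,logL] = estimate(ToEstMdl,y,'X',X,'Display','off');

% Count free parameters, then compute AIC and BIC.
numParams = sum(any(EstParamCov));
[aic,bic] = aicbic(logL,numParams,numel(y));
fprintf('Free parameters: %d, AIC: %.2f, BIC: %.2f\n',numParams,aic,bic);
```

Counting from EstParamCov avoids hard-coding the number of parameters when some of them are held fixed.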

Algorithms

estimate estimates the parameters as follows:

  1. Infer the unconditional disturbances from the regression model.

  2. Infer the residuals of the ARIMA error model.

  3. Use the distribution of the innovations to build the likelihood function.

  4. Maximize the loglikelihood function with respect to the parameters using fmincon.
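For Gaussian innovations, the four steps above can be written compactly as follows. The notation is an assumption introduced for illustration: $c$ is the intercept, $\beta$ the regression coefficients, $a(L)$ and $b(L)$ the compound autoregressive and moving average lag polynomials (including any seasonal and integration factors), and $\sigma^2$ the innovation variance. The loglikelihood shown is the standard form conditional on the presample values.

```latex
\begin{aligned}
u_t(\theta) &= y_t - c - X_t \beta
  && \text{(step 1: unconditional disturbances)} \\
a(L)\, u_t(\theta) &= b(L)\, \varepsilon_t(\theta)
  && \text{(step 2: solve recursively for the residuals } \varepsilon_t\text{)} \\
\log L(\theta) &= -\frac{T}{2} \log\!\left( 2\pi\sigma^2 \right)
  - \frac{1}{2\sigma^2} \sum_{t=1}^{T} \varepsilon_t(\theta)^2
  && \text{(step 3: Gaussian loglikelihood)} \\
\hat{\theta} &= \arg\max_{\theta} \log L(\theta)
  && \text{(step 4: numerical maximization)}
\end{aligned}
```

With t-distributed innovations, the third line is replaced by the corresponding Student's t loglikelihood, and the degrees of freedom become an additional parameter.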

