estimate

Class: regARIMA

Estimate parameters of regression models with ARIMA errors

Syntax

EstMdl = estimate(Mdl,y)
[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y)
[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,Name,Value)

Description

EstMdl = estimate(Mdl,y) uses maximum likelihood to estimate the parameters of the regression model with ARIMA time series errors, Mdl, given the response series y. EstMdl is a regARIMA model that stores the results.

[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y) additionally returns EstParamCov, the variance-covariance matrix associated with the estimated parameters; logL, the optimized loglikelihood objective function value; and info, a data structure of summary information.

[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,Name,Value) estimates the model using additional options specified by one or more Name,Value pair arguments.

Input Arguments

Regression model with ARIMA errors (Mdl), specified as a regARIMA model returned by regARIMA or estimate.

estimate treats non-NaN elements in Mdl as equality constraints, and does not estimate the corresponding parameters.
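As a minimal sketch of how equality constraints work, the following creates a model template, fixes the intercept, and checks which properties remain NaN (and are therefore free for estimation). The variable names are illustrative only.

```matlab
% Create an ARMA(1,1) error model template; estimable parameters are NaN.
Mdl = regARIMA(1,0,1);

% Fix the intercept at 0. estimate treats this non-NaN value as an
% equality constraint and does not estimate it.
Mdl.Intercept = 0;

% Check which parameters remain free (NaN) for estimation.
isInterceptFree = isnan(Mdl.Intercept);   % intercept is held fixed
isARFree        = isnan(Mdl.AR{1});       % AR coefficient is still unknown
isMAFree        = isnan(Mdl.MA{1});       % MA coefficient is still unknown
isVarianceFree  = isnan(Mdl.Variance);    % innovation variance is still unknown
```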

Single path of response data (y) to which the model is fit, specified as a numeric column vector. The last observation of y is the latest.

Data Types: double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Initial estimates of ARIMA error model nonseasonal autoregressive coefficients, specified as the comma-separated pair consisting of 'AR0' and a numeric vector.

The number of coefficients in AR0 must equal the number of lags associated with nonzero coefficients in the nonseasonal autoregressive polynomial.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double
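The length requirement matters most when the AR polynomial skips lags. The following sketch assumes an error model with nonzero AR coefficients at lags 1 and 4 only; the simulated data and the starting values are illustrative assumptions, not recommended settings.

```matlab
% Fully specified model used only to simulate illustrative data.
Mdl0 = regARIMA('Intercept',1,'AR',{0.4 0.2},'ARLags',[1 4],'Variance',0.5);
rng(1);                       % for reproducibility
y = simulate(Mdl0,200);

% Template with unknown AR coefficients at lags 1 and 4 only.
Mdl = regARIMA('ARLags',[1 4]);

% AR0 needs one entry per nonzero lag (two here), not one per lag up to 4.
EstMdl = estimate(Mdl,y,'AR0',[0.3 0.1],'Display','off');
```

The same counting rule applies to MA0, SAR0, and SMA0 for their respective polynomials.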

Initial estimates of regression coefficients, specified as the comma-separated pair consisting of 'Beta0' and a numeric vector.

The number of coefficients in Beta0 must equal the number of columns of X.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

Command Window display option (Display), specified as one or more of the values in this table.

Value            Information Displayed
"diagnostics"    Optimization diagnostics
"full"           Maximum likelihood parameter estimates, standard errors, t statistics, iterative optimization information, and optimization diagnostics
"iter"           Iterative optimization information
"off"            None
"params"         Maximum likelihood parameter estimates, standard errors, and t statistics and p-values of coefficient significance tests

Example: Display="off" is well suited for running a simulation that estimates many models.

Example: Display=["params" "diagnostics"] displays all estimation results and the optimization diagnostics.

Data Types: char | cell | string

Initial t-distribution degree-of-freedom estimate, specified as the comma-separated pair consisting of 'DoF0' and a positive scalar. DoF0 must exceed 2.

Data Types: double
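A starting value for the degrees of freedom is only meaningful when the innovation distribution is Student's t with an unknown degrees-of-freedom parameter. The sketch below assumes such a model; the simulated data and the starting value of 8 are illustrative choices.

```matlab
% Simulate from an AR(1) error model with t innovations (illustrative values).
Mdl0 = regARIMA('Intercept',0,'AR',0.5,'Variance',1, ...
    'Distribution',struct('Name','t','DoF',6));
rng(1);                       % for reproducibility
y = simulate(Mdl0,300);

% Template with a t distribution; the degrees of freedom are unknown (NaN).
Mdl = regARIMA(1,0,0);
Mdl.Distribution = 't';

% Supply an initial degree-of-freedom estimate greater than 2.
EstMdl = estimate(Mdl,y,'DoF0',8,'Display','off');
estimatedDoF = EstMdl.Distribution.DoF;
```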

Presample innovations that have mean 0 and provide initial values for the ARIMA error model, specified as the comma-separated pair consisting of 'E0' and a numeric column vector. E0 must contain at least Mdl.Q rows. If E0 contains extra rows, then estimate uses the latest Mdl.Q presample innovations. The last row contains the latest presample innovation.

By default, estimate sets the necessary presample innovations to 0.

Data Types: double

Initial regression model intercept estimate, specified as the comma-separated pair consisting of 'Intercept0' and a scalar.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

Initial estimates of ARIMA error model nonseasonal moving average coefficients, specified as the comma-separated pair consisting of 'MA0' and a numeric vector.

The number of coefficients in MA0 must equal the number of lags associated with nonzero coefficients in the nonseasonal moving average polynomial.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

Optimization options (Options), specified as an optimoptions optimization controller. For details on modifying the default values of the optimizer, see optimoptions or fmincon in Optimization Toolbox™.

For example, to change the constraint tolerance to 1e-6, set options = optimoptions(@fmincon,ConstraintTolerance=1e-6,Algorithm="sqp"). Then, pass options into estimate using Options=options.

By default, estimate uses the same default options as fmincon, except Algorithm is "sqp" and ConstraintTolerance is 1e-7.
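The following sketch shows one way to pass custom optimizer settings, assuming Optimization Toolbox is installed. The simulated data and the iteration limit are illustrative; the quoted name-value syntax is used so that the code also runs in releases before R2021a.

```matlab
% Simulate illustrative data from an AR(1) error model.
Mdl0 = regARIMA('Intercept',0,'AR',0.5,'Variance',0.2);
rng(1);                       % for reproducibility
y = simulate(Mdl0,150);

% Build an fmincon options controller with a looser constraint tolerance
% and an explicit iteration limit.
options = optimoptions(@fmincon,'ConstraintTolerance',1e-6, ...
    'Algorithm','sqp','MaxIterations',500);

% Pass the controller to estimate through the Options argument.
Mdl = regARIMA(1,0,0);
[EstMdl,~,~,info] = estimate(Mdl,y,'Options',options,'Display','off');
```

Because Display is specified here, it takes precedence over any display settings stored in options.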

Initial estimates of ARIMA error model seasonal autoregressive coefficients, specified as the comma-separated pair consisting of 'SAR0' and a numeric vector.

The number of coefficients in SAR0 must equal the number of lags associated with nonzero coefficients in the seasonal autoregressive polynomial.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

Initial estimates of ARIMA error model seasonal moving average coefficients, specified as the comma-separated pair consisting of 'SMA0' and a numeric vector.

The number of coefficients in SMA0 must equal the number of lags with nonzero coefficients in the seasonal moving average polynomial.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

Presample unconditional disturbances that provide initial values for the ARIMA error model, specified as the comma-separated pair consisting of 'U0' and a numeric column vector. U0 must contain at least Mdl.P rows. If U0 contains extra rows, then estimate uses the latest Mdl.P presample unconditional disturbances. The last row contains the latest presample unconditional disturbance.

By default, estimate backcasts the necessary number of presample unconditional disturbances.

Data Types: double

Initial estimate of ARIMA error model innovation variance, specified as the comma-separated pair consisting of 'Variance0' and a positive scalar.

By default, estimate derives initial estimates using standard time series techniques.

Data Types: double

Predictor data in the regression model, specified as the comma-separated pair consisting of 'X' and a matrix.

The columns of X are separate, synchronized time series, with the last row containing the latest observations. The number of rows of X must be at least the length of y. If the number of rows of X exceeds the number required, then estimate uses the latest observations.

By default, estimate does not estimate the regression coefficients regardless of their presence in Mdl.

Data Types: double
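As an illustration of the row-matching rule, the sketch below passes a predictor matrix with 20 more rows than the response and compares the fit with one that uses only the latest 100 rows. The data are simulated and the names XLong, EstLong, and EstTrim are hypothetical. Under the rule described above, the two fits should agree.

```matlab
% Simulate a response of length 100 from the latest 100 predictor rows.
Mdl0 = regARIMA('Intercept',0,'AR',0.5,'Beta',[0.1 -0.2],'Variance',0.1);
rng(1);                                   % for reproducibility
XLong = randn(120,2);                     % 20 more rows than the response
XTrim = XLong(end-99:end,:);              % latest 100 rows
y = simulate(Mdl0,100,'X',XTrim);

% Fit once with the full predictor matrix and once with the trimmed one.
Mdl = regARIMA(1,0,0);
EstLong = estimate(Mdl,y,'X',XLong,'Display','off');
EstTrim = estimate(Mdl,y,'X',XTrim,'Display','off');

% Largest absolute difference between the two sets of regression coefficients.
maxBetaDiff = max(abs(EstLong.Beta - EstTrim.Beta));
```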

Notes

  • NaNs in y, E0, U0, and X indicate missing values, and estimate removes them. The software merges the presample data (E0 and U0) separately from the effective sample data (X and y), then uses list-wise deletion to remove any NaNs. Removing NaNs in the data reduces the sample size, and can also create irregular time series.

  • estimate assumes that you synchronize the data (presample separately from effective sample) such that the latest observations occur simultaneously.

  • The intercept of a regression model with ARIMA errors that has nonzero degrees of seasonal or nonseasonal integration is not identifiable, so estimate cannot estimate it. If you pass in such a model for estimation, estimate displays a warning in the Command Window and sets EstMdl.Intercept to NaN.

  • If you specify a value for Display, then it takes precedence over the specifications of the optimization options Diagnostics and Display. Otherwise, estimate honors all selections related to the display of optimization information in the optimization options.
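The list-wise deletion described in the first note can be pictured with a small example in base MATLAB. This is only an illustration of the concept on made-up data, not the internal code that estimate runs.

```matlab
% Effective-sample data with missing values in different rows.
y = [1.0; NaN; 3.0; 4.0; 5.0];
X = [0.1 0.2; 0.3 0.4; NaN 0.6; 0.7 0.8; 0.9 1.0];

% Merge the response and predictors, then flag any row with a NaN.
data = [y X];
hasMissing = any(isnan(data),2);

% List-wise deletion: drop every flagged row from both y and X.
yClean = y(~hasMissing);
XClean = X(~hasMissing,:);
```

Rows 2 and 3 are dropped, so the cleaned series jumps from the first to the fourth time point. This is how removing NaNs can produce an irregular time series.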

Output Arguments

Model containing the parameter estimates, returned as a regARIMA model. estimate uses maximum likelihood to calculate all parameter estimates not constrained by Mdl (that is, all parameters in Mdl that you set to NaN).

Variance-covariance matrix of maximum likelihood estimates of model parameters known to the optimizer, returned as a matrix.

The rows and columns contain the covariances of the parameter estimates. The standard errors of the parameter estimates are the square roots of the entries along the main diagonal. The rows and columns associated with any parameters held fixed as equality constraints contain 0s.

estimate uses the outer product of gradients (OPG) method to perform covariance matrix estimation.

estimate orders the parameters in EstParamCov as follows:

  • Intercept

  • Nonzero AR coefficients at positive lags

  • Nonzero SAR coefficients at positive lags

  • Nonzero MA coefficients at positive lags

  • Nonzero SMA coefficients at positive lags

  • Regression coefficients (when you specify X in estimate)

  • Innovations variance

  • Degrees of freedom for the t distribution

Data Types: double
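A minimal sketch of how the ordering and the standard errors fit together, assuming a regression model with ARMA(2,1) errors, two predictors, and an intercept fixed at 0. The row labels and the table T are illustrative conveniences, not part of the estimate output.

```matlab
% Simulate illustrative data.
Mdl0 = regARIMA('Intercept',0,'AR',{0.5 -0.8},'MA',-0.5, ...
    'Beta',[0.1 -0.2],'Variance',0.1);
rng(1);                                   % for reproducibility
X = randn(100,2);
y = simulate(Mdl0,100,'X',X);

% Fit with the intercept held fixed at 0.
Mdl = regARIMA(2,0,1);
Mdl.Intercept = 0;
[EstMdl,EstParamCov] = estimate(Mdl,y,'X',X,'Display','off');

% Standard errors are the square roots of the main diagonal.
se = sqrt(diag(EstParamCov));

% Collect the estimates in the same order as the rows of EstParamCov.
est = [EstMdl.Intercept; cell2mat(EstMdl.AR)'; cell2mat(EstMdl.MA)'; ...
    EstMdl.Beta'; EstMdl.Variance];
labels = {'Intercept';'AR{1}';'AR{2}';'MA{1}';'Beta(1)';'Beta(2)';'Variance'};
T = table(est,se,'RowNames',labels, ...
    'VariableNames',{'Estimate','StandardError'});
```

The standard error in the Intercept row is 0 because the intercept is held fixed as an equality constraint.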

Optimized loglikelihood objective function value, returned as a scalar.

Data Types: double

Summary information, returned as a structure.

Field       Description
exitflag    Optimization exit flag (see fmincon in Optimization Toolbox)
options     Optimization options controller (see optimoptions and fmincon in Optimization Toolbox)
X           Vector of final parameter estimates
X0          Vector of initial parameter estimates

For example, you can display the vector of final estimates by typing info.X in the Command Window.

Data Types: struct
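One way to use info after a fit is to check the exit flag and compare starting values with final estimates. The sketch below uses simulated data; the names converged and comparison are illustrative.

```matlab
% Simulate illustrative data from an AR(1) error model.
Mdl0 = regARIMA('Intercept',0.5,'AR',0.6,'Variance',0.2);
rng(1);                       % for reproducibility
y = simulate(Mdl0,200);

Mdl = regARIMA(1,0,0);
[EstMdl,~,logL,info] = estimate(Mdl,y,'Display','off');

% Positive fmincon exit flags indicate that the optimizer stopped at a
% candidate solution; zero or negative flags warrant a closer look.
converged = info.exitflag > 0;

% Place initial and final parameter vectors side by side.
comparison = [info.X0(:) info.X(:)];
```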

Examples

Fit this regression model with ARMA(2,1) errors to simulated data:

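$$
\begin{aligned}
y_t &= X_t \begin{bmatrix} 0.1 \\ -0.2 \end{bmatrix} + u_t \\
u_t &= 0.5u_{t-1} - 0.8u_{t-2} + \varepsilon_t - 0.5\varepsilon_{t-1},
\end{aligned}
$$

where $\varepsilon_t$ is Gaussian with variance 0.1.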

Specify the regression model with ARMA(2,1) errors. Simulate responses from the model and two predictor series.

Mdl0 = regARIMA('Intercept',0,'AR',{0.5 -0.8}, ...
    'MA',-0.5,'Beta',[0.1 -0.2],'Variance',0.1);
rng(1);
X =  randn(100,2);
y = simulate(Mdl0,100,'X',X);

Specify a regression model with ARMA(2,1) errors with no intercept, and unknown coefficients and variance.

Mdl = regARIMA(2,0,1);
Mdl.Intercept = 0      % Exclude the intercept
Mdl = 
  regARIMA with properties:

     Description: "ARMA(2,1) Error Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
       Intercept: 0
            Beta: [1×0]
               P: 2
               Q: 1
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
        Variance: NaN

The AR coefficients, MA coefficients, and the innovation variance are NaN values. estimate estimates those parameters, but not the intercept. The intercept is held fixed at 0.

Fit the regression model with ARMA(2,1) errors to the data.

EstMdl = estimate(Mdl,y,'X',X,'Display','params');
 
    Regression with ARMA(2,1) Error Model (Gaussian Distribution):
 
                  Value      StandardError    TStatistic      PValue  
                 ________    _____________    __________    __________

    Intercept           0              0           NaN             NaN
    AR{1}          0.6203        0.10419        5.9534      2.6267e-09
    AR{2}        -0.69717       0.079575       -8.7612      1.9315e-18
    MA{1}        -0.55808         0.1319       -4.2312      2.3243e-05
    Beta(1)       0.10367       0.021735        4.7696      1.8456e-06
    Beta(2)      -0.20945       0.024188        -8.659      4.7574e-18
    Variance     0.074885      0.0090358        8.2876      1.1558e-16

The result, EstMdl, is a new regARIMA model. The estimates in EstMdl resemble the parameter values that generated the simulated data.

Fit a regression model with ARMA(1,1) errors by regressing the differenced log GDP onto the differenced CPI, using initial values for the parameters.

Load the US Macroeconomic data set and preprocess the data.

load Data_USEconModel
logGDP = log(DataTimeTable.GDP);
dlogGDP = diff(logGDP);                 % For stationarity
dCPI = diff(DataTimeTable.CPIAUCSL);    % For stationarity
T = length(dlogGDP);                    % Effective sample size

Specify a regression model with ARMA(1,1) errors in which all estimable parameters are unknown.

EstMdl = regARIMA(1,0,1);

Fit the model to the first half of the data.

EstMdl0 = estimate(EstMdl,dlogGDP(1:ceil(T/2)),...
    'X',dCPI(1:ceil(T/2)),'Display','off');

The result is a new regARIMA model with the estimated parameters.

Use the estimated parameters as initial values for fitting the second half of the data.

Intercept0 = EstMdl0.Intercept;
AR0        = EstMdl0.AR{1};
MA0        = EstMdl0.MA{1};
Variance0  = EstMdl0.Variance;
Beta0      = EstMdl0.Beta;

[EstMdl,~,~,info] = estimate(EstMdl,dlogGDP(floor(T/2)+1:end),...
    'X',dCPI(floor(T/2)+1:end),'Display','params',...
   'Intercept0',Intercept0,'AR0',AR0,'MA0',MA0,...
   'Variance0',Variance0,'Beta0',Beta0);
 
    Regression with ARMA(1,1) Error Model (Gaussian Distribution):
 
                   Value       StandardError    TStatistic      PValue   
                 __________    _____________    __________    ___________

    Intercept      0.011174       0.002102        5.3158       1.0619e-07
    AR{1}           0.78684       0.036229        21.718      1.3759e-104
    MA{1}          -0.47362        0.06554       -7.2264         4.96e-13
    Beta(1)       0.0021933     0.00058327        3.7604       0.00016966
    Variance     4.8349e-05     4.1705e-06        11.593       4.4716e-31

Display all of the parameter estimates using info.X.

info.X
ans = 5×1

    0.0112
    0.7868
   -0.4736
    0.0022
    0.0000

The order of the parameter estimates in info.X matches the order that estimate displays in its output table.

Tips

  • To access values of the estimation results, including the number of free parameters in the model, pass EstMdl to summarize.
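A brief sketch of the tip above, assuming that summarize returns a structure with the fields NumEstimatedParameters and AIC when called with an output argument. Check the summarize reference page for the exact field names in your release.

```matlab
% Simulate illustrative data and fit an AR(1) error model.
Mdl0 = regARIMA('Intercept',0,'AR',0.5,'Variance',0.2);
rng(1);                       % for reproducibility
y = simulate(Mdl0,200);
Mdl = regARIMA(1,0,0);
EstMdl = estimate(Mdl,y,'Display','off');

% Request the summary as a structure instead of printing it.
results = summarize(EstMdl);
numFree = results.NumEstimatedParameters;   % number of free parameters
aic     = results.AIC;                      % Akaike information criterion
```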

Algorithms

estimate estimates the parameters as follows:

  1. Infer the unconditional disturbances from the regression model.

  2. Infer the residuals of the ARIMA error model.

  3. Use the distribution of the innovations to build the likelihood function.

  4. Maximize the loglikelihood function with respect to the parameters using fmincon.
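The four steps above can be sketched in base MATLAB for a simple case: a regression with one predictor and Gaussian AR(1) errors. This is a simplified illustration, not the code that estimate runs. It conditions on the first observation instead of backcasting, reparameterizes the variance through its logarithm to keep it positive, and swaps in fminsearch for fmincon so that it needs no toolbox. All names and data values are illustrative.

```matlab
% Simulate y_t = 1 + 2*x_t + u_t with u_t = 0.5*u_{t-1} + e_t, Var(e_t) = 0.1.
rng(1);                                   % for reproducibility
T = 500;
X = randn(T,1);
innov = sqrt(0.1)*randn(T,1);
u = filter(1,[1 -0.5],innov);
y = 1 + 2*X + u;

% Parameter vector: theta = [intercept; beta; phi; log(sigma2)].
% Step 1: infer the unconditional disturbances from the regression model.
inferU = @(theta) y - theta(1) - X*theta(2);
% Step 2: infer the residuals of the AR(1) error model.
inferE = @(uHat,theta) uHat(2:end) - theta(3)*uHat(1:end-1);
% Step 3: build the Gaussian loglikelihood from the residuals.
logLfun = @(res,sigma2) -0.5*numel(res)*log(2*pi*sigma2) ...
    - sum(res.^2)/(2*sigma2);
negLogL = @(theta) -logLfun(inferE(inferU(theta),theta),exp(theta(4)));

% Starting values from ordinary least squares.
b = [ones(T,1) X]\y;
theta0 = [b; 0.1; log(var(y - [ones(T,1) X]*b))];

% Step 4: maximize the loglikelihood (minimize its negative).
opts = optimset('MaxFunEvals',5000,'MaxIter',5000);
thetaHat = fminsearch(negLogL,theta0,opts);
sigma2Hat = exp(thetaHat(4));             % back-transform the variance
```

The estimates in thetaHat should land close to the values used to simulate the data. In practice, estimate also enforces stationarity and invertibility constraints, which this sketch omits.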
