Estimate ARIMA or ARIMAX model parameters

`EstMdl = estimate(Mdl,y)`

`[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y)`

`[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,Name,Value)`

`EstMdl = estimate(Mdl,y)` uses maximum likelihood to estimate the parameters of the ARIMA(*p*,*D*,*q*) model `Mdl` given the observed univariate time series `y`. `EstMdl` is an `arima` model that stores the results.

`[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y)` additionally returns `EstParamCov`, the variance-covariance matrix associated with estimated parameters, `logL`, the optimized loglikelihood objective function, and `info`, a data structure of summary information.

`[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,Name,Value)` estimates the model with additional options specified by one or more `Name,Value` pair arguments.

`Mdl` — ARIMA or ARIMAX model

`arima` model

ARIMA or ARIMAX model, specified as an `arima` model returned by `arima` or `estimate`.

`estimate` treats non-`NaN` elements in `Mdl` as equality constraints and does not estimate the corresponding parameters.
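The following is a minimal sketch of how an equality constraint can be set up, assuming Econometrics Toolbox is installed. The simulated data, the seed, and the variable names `constantHeld` and `arEstimated` are illustrative choices, not part of the function reference.

```matlab
% Simulate illustrative data from a known ARMA(2,1) process.
rng(1);                                   % seed chosen arbitrarily
Mdl0 = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(Mdl0,500);

% Template model: every parameter starts as NaN (to be estimated).
Mdl = arima(2,0,1);
Mdl.Constant = 0;                         % non-NaN value acts as an equality constraint

EstMdl = estimate(Mdl,y,'Display','off');

% The constrained constant is left at 0; the NaN-valued AR terms are filled in.
constantHeld = (EstMdl.Constant == 0);
arEstimated  = ~any(isnan(cell2mat(EstMdl.AR)));
```

Any property left as `NaN` in `Mdl` is estimated; any property given a numeric value is held fixed at that value.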

`y` — Single path of response data

numeric column vector

Single path of response data to which the model is fit, specified as a numeric column vector. The last observation of `y` is the latest.

**Data Types:** `double`

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

`'AR0'` — Initial estimates of nonseasonal autoregressive coefficients

numeric vector

Initial estimates of the nonseasonal autoregressive coefficients for the ARIMA model, specified as the comma-separated pair consisting of `'AR0'` and a numeric vector.

The number of coefficients in `AR0` must equal the number of lags associated with nonzero coefficients in the nonseasonal autoregressive polynomial, `ARLags`.

By default, `estimate` derives initial estimates using standard time series techniques.

**Data Types:** `double`

`'Beta0'` — Initial estimates of regression coefficients

numeric vector

Initial estimates of regression coefficients for the regression component, specified as the comma-separated pair consisting of `'Beta0'` and a numeric vector.

The number of coefficients in `Beta0` must equal the number of columns of `X`.

By default, `estimate` derives initial estimates using standard time series techniques.

**Data Types:** `double`

`'Constant0'` — Initial ARIMA model constant estimate

scalar

Initial ARIMA model constant estimate, specified as the comma-separated pair consisting of `'Constant0'` and a scalar.

By default, `estimate` derives initial estimates using standard time series techniques.

**Data Types:** `double`

`'Display'` — Command Window display option

`'params'` (default) | `'diagnostics'` | `'full'` | `'iter'` | `'off'` | string vector | cell vector of character vectors

Command Window display option, specified as the comma-separated pair consisting of `'Display'` and a value or any combination of values in this table.

Value | `estimate` Displays |
---|---|
`'diagnostics'` | Optimization diagnostics |
`'full'` | Maximum likelihood parameter estimates, standard errors, *t* statistics, iterative optimization information, and optimization diagnostics |
`'iter'` | Iterative optimization information |
`'off'` | No display in the Command Window |
`'params'` | Maximum likelihood parameter estimates, standard errors, and *t* statistics |

For example:

- To run a simulation where you are fitting many models, and therefore want to suppress all output, use `'Display','off'`.
- To display all estimation results and the optimization diagnostics, use `'Display',{'params','diagnostics'}`.

**Data Types:** `char` | `cell` | `string`
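As a rough illustration of the first case, the sketch below fits several candidate models in a loop with all output suppressed. It assumes Econometrics Toolbox is installed; the simulated data, the candidate orders, and the variable names `maxP` and `logLs` are hypothetical.

```matlab
% Simulate illustrative data.
rng(1);                                   % seed chosen arbitrarily
Mdl0 = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(Mdl0,500);

% Fit AR(1), AR(2), and AR(3) candidates without printing anything.
maxP  = 3;
logLs = zeros(maxP,1);
for p = 1:maxP
    Mdl = arima(p,0,0);
    [~,~,logLs(p)] = estimate(Mdl,y,'Display','off');
end
```

The stored loglikelihoods can then be compared or passed to a model selection criterion of your choice.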

`'DoF0'` — Initial *t*-distribution degrees-of-freedom parameter estimate

`10` (default) | positive scalar

Initial *t*-distribution degrees-of-freedom parameter estimate, specified as the comma-separated pair consisting of `'DoF0'` and a positive scalar. `DoF0` must exceed 2.

**Data Types:** `double`
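A minimal sketch of supplying `'DoF0'`, assuming Econometrics Toolbox is installed. The data-generating model, the starting value of 8, and the variable name `dof` are illustrative assumptions.

```matlab
% Simulate illustrative AR(1) data with Student's t innovations (5 degrees of freedom).
rng(1);                                   % seed chosen arbitrarily
Mdl0 = arima('AR',0.5,'Constant',0,'Variance',0.1,...
    'Distribution',struct('Name','t','DoF',5));
y = simulate(Mdl0,500);

% Template model with a t innovation distribution; its DoF is NaN (to be estimated).
Mdl = arima(1,0,0);
Mdl.Distribution = 't';

% Start the degrees-of-freedom search at 8 instead of the default 10.
EstMdl = estimate(Mdl,y,'DoF0',8,'Display','off');
dof = EstMdl.Distribution.DoF;            % estimated degrees of freedom
```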

`'E0'` — Presample innovations

numeric column vector

Presample innovations that have mean 0 and provide initial values for the ARIMA(*p*,*D*,*q*) model, specified as the comma-separated pair consisting of `'E0'` and a numeric column vector.

`E0` must contain at least `Mdl.Q` rows. If you use a conditional variance model, such as a `garch` model, then the software might require more than `Mdl.Q` presample innovations.

If `E0` contains extra rows, then `estimate` uses the latest `Mdl.Q` presample innovations. The last row contains the latest presample innovation.

By default, `estimate` sets the necessary presample innovations to `0`.

**Data Types:** `double`

`'MA0'` — Initial estimates of nonseasonal moving average coefficients

numeric vector

Initial estimates of nonseasonal moving average coefficients for the ARIMA(*p*,*D*,*q*) model, specified as the comma-separated pair consisting of `'MA0'` and a numeric vector.

The number of coefficients in `MA0` must equal the number of lags associated with nonzero coefficients in the nonseasonal moving average polynomial, `MALags`.

By default, `estimate` derives initial estimates using standard time series techniques.

**Data Types:** `double`

`'Options'` — Optimization options

`optimoptions` optimization controller

Optimization options, specified as the comma-separated pair consisting of `'Options'` and an `optimoptions` optimization controller. For details on altering the default values of the optimizer, see `optimoptions` or `fmincon` in Optimization Toolbox™.

For example, to change the constraint tolerance to `1e-6`, set `Options = optimoptions(@fmincon,'ConstraintTolerance',1e-6,'Algorithm','sqp')`. Then, pass `Options` into `estimate` using `'Options',Options`.

By default, `estimate` uses the same default options as `fmincon`, except `Algorithm` is `'sqp'` and `ConstraintTolerance` is `1e-7`.
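The sketch below puts the constraint-tolerance example into a complete call. It assumes both Econometrics Toolbox and Optimization Toolbox are installed; the simulated data and the variable name `usedOptions` are illustrative.

```matlab
% Simulate illustrative data.
rng(1);                                   % seed chosen arbitrarily
Mdl0 = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(Mdl0,500);

Mdl = arima(2,0,1);

% Loosen the constraint tolerance from the estimate default of 1e-7 to 1e-6.
Options = optimoptions(@fmincon,'ConstraintTolerance',1e-6,'Algorithm','sqp');

% Pass the options controller to estimate; info.options stores the options in effect.
[EstMdl,~,~,info] = estimate(Mdl,y,'Options',Options,'Display','off');
usedOptions = info.options;
```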

`'SAR0'` — Initial estimates of seasonal autoregressive coefficients

numeric vector

Initial estimates of seasonal autoregressive coefficients for the ARIMA(*p*,*D*,*q*) model, specified as the comma-separated pair consisting of `'SAR0'` and a numeric vector.

The number of coefficients in `SAR0` must equal the number of lags associated with nonzero coefficients in the seasonal autoregressive polynomial, `SARLags`.

By default, `estimate` derives initial estimates using standard time series techniques.

**Data Types:** `double`

`'SMA0'` — Initial estimates of seasonal moving average coefficients

numeric vector

Initial estimates of seasonal moving average coefficients for the ARIMA(*p*,*D*,*q*) model, specified as the comma-separated pair consisting of `'SMA0'` and a numeric vector.

The number of coefficients in `SMA0` must equal the number of lags associated with nonzero coefficients in the seasonal moving average polynomial, `SMALags`.

By default, `estimate` derives initial estimates using standard time series techniques.

**Data Types:** `double`

`'V0'` — Presample conditional variances

numeric column vector with positive entries

Presample conditional variances that provide initial values for any conditional variance model, specified as the comma-separated pair consisting of `'V0'` and a numeric column vector with positive entries.

`V0` must contain at least the number of observations required to initialize the variance model. If `V0` has more rows than necessary, then `estimate` uses only the latest observations. The last row contains the latest observation.

If the variance of the model is constant, then `V0` is unnecessary.

By default, `estimate` sets the necessary presample conditional variances to the average of the squared inferred residuals.

**Data Types:** `double`

`'Variance0'` — Initial estimates of variances of innovations

positive scalar | cell vector of name-value pair arguments

Initial estimates of variances of innovations for the ARIMA(*p*,*D*,*q*) model, specified as the comma-separated pair consisting of `'Variance0'` and a positive scalar or a cell vector of name-value pair arguments.

- If `Variance0` is a positive scalar, the variance of `Mdl` (stored in `Mdl.Variance`) must be constant.
- If `Variance0` is a cell vector, it contains name-value pair arguments that set initial estimates for the conditional variance model stored in `Mdl.Variance`.

By default, `estimate` derives initial estimates using standard time series techniques.

**Example:** For a model with a constant variance, set `'Variance0',2` to specify an initial estimate of `2` for the model variance.

**Example:** For a composite conditional mean and variance model, set `'Variance0',{'Constant0',2,'ARCH0',0.1}` to specify an initial estimate of `2` for the conditional variance model constant, and an initial estimate of `0.1` for the lag 1 coefficient in the ARCH polynomial.

**Data Types:** `double` | `cell`
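A hedged sketch of the cell-vector form for a composite AR(1)/GARCH(1,1) model follows, assuming Econometrics Toolbox is installed. The data-generating parameters, the starting values, and the variable name `isGarch` are illustrative assumptions rather than recommended settings.

```matlab
% Simulate illustrative data from a known AR(1) model with GARCH(1,1) variance.
rng(1);                                   % seed chosen arbitrarily
VarMdl0 = garch('Constant',0.02,'GARCH',{0.5},'ARCH',{0.2});
Mdl0 = arima('AR',0.4,'Constant',0,'Variance',VarMdl0);
y = simulate(Mdl0,1000);

% Template composite model: AR(1) conditional mean, GARCH(1,1) conditional variance.
Mdl = arima('ARLags',1,'Variance',garch(1,1));

% Start the variance-model constant at 0.05 and the lag 1 ARCH coefficient at 0.1.
EstMdl = estimate(Mdl,y,...
    'Variance0',{'Constant0',0.05,'ARCH0',0.1},'Display','off');

isGarch = isa(EstMdl.Variance,'garch');   % the fitted variance is still a garch model
```

Initial values that are not listed in the cell vector (here, the GARCH coefficient) are left to the default initialization.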

`'X'` — Exogenous predictors

matrix

Exogenous predictors in the regression model, specified as the comma-separated pair consisting of `'X'` and a matrix.

The columns of `X` are separate, synchronized time series, with the last row containing the latest observations.

If you do not specify `Y0`, then the number of rows of `X` must be at least `numel(y) + Mdl.P`. Otherwise, the number of rows of `X` must be at least the length of `y`.

If the number of rows of `X` exceeds the number necessary, then `estimate` uses the latest observations and synchronizes `X` with the response series `y`.

By default, `estimate` does not estimate the regression coefficients regardless of their presence in `Mdl`.

**Data Types:** `double`

`'Y0'` — Presample response data

numeric column vector

Presample response data that provides initial values for the ARIMA(*p*,*D*,*q*) model, specified as the comma-separated pair consisting of `'Y0'` and a numeric column vector.

`Y0` is a column vector with at least `Mdl.P` rows. If the number of rows in `Y0` exceeds `Mdl.P`, `estimate` uses only the latest `Mdl.P` observations. The last row contains the latest observation.

By default, `estimate` backward forecasts (backcasts) the necessary number of presample observations.

**Data Types:** `double`
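One way to supply presample responses is to set aside the first `Mdl.P` observations of a longer series, as in this sketch. It assumes Econometrics Toolbox is installed; the simulated series and the names `yAll`, `y0`, and `yEst` are hypothetical.

```matlab
% Simulate an illustrative series slightly longer than the estimation sample.
rng(1);                                   % seed chosen arbitrarily
Mdl0 = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
yAll = simulate(Mdl0,502);

Mdl = arima(2,0,1);
Mdl.Constant = 0;

% Partition the series: the first Mdl.P observations initialize the model,
% and the remaining observations form the estimation sample.
y0   = yAll(1:Mdl.P);
yEst = yAll(Mdl.P+1:end);

EstMdl = estimate(Mdl,yEst,'Y0',y0,'Display','off');
```

With `Y0` supplied, `estimate` does not need to backcast the presample responses.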

- `NaN`s indicate missing values, and `estimate` removes them. The software merges the presample data (`E0`, `V0`, and `Y0`) separately from the effective sample data (`X` and `y`), then uses list-wise deletion to remove any `NaN`s. Removing `NaN`s in the data reduces the sample size, and can also create irregular time series.
- `estimate` assumes that you synchronize the response and exogenous predictors such that the last (latest) observation of each occurs simultaneously. The software also assumes that you synchronize the presample series similarly.
- If you specify a value for `Display`, then it takes precedence over the specifications of the optimization options `Diagnostics` and `Display`. Otherwise, `estimate` honors all selections related to the display of optimization information in the optimization options.

`EstMdl` — Model containing parameter estimates

`arima` model

Model containing parameter estimates, returned as an `arima` model. `estimate` uses maximum likelihood to calculate all parameter estimates not constrained by `Mdl` (that is, all parameters in `Mdl` that you set to `NaN`).

`EstParamCov` — Variance-covariance matrix of maximum likelihood estimates

matrix

Variance-covariance matrix of maximum likelihood estimates of model parameters known to the optimizer, returned as a matrix.

The rows and columns contain the covariances of the parameter estimates. The standard errors of the parameter estimates are the square roots of the entries along the main diagonal.

The rows and columns associated with any parameters held fixed as equality constraints contain `0`s.

`estimate` uses the outer product of gradients (OPG) method to perform covariance matrix estimation.

`estimate` orders the parameters in `EstParamCov` as follows:

1. Constant
2. Nonzero `AR` coefficients at positive lags
3. Nonzero `SAR` coefficients at positive lags
4. Nonzero `MA` coefficients at positive lags
5. Nonzero `SMA` coefficients at positive lags
6. Regression coefficients (when you specify `X` in `estimate`)
7. Variance parameters (scalar for constant-variance models, vector of additional parameters otherwise)
8. Degrees of freedom (*t* innovation distribution only)

**Data Types:** `double`
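The relationship between `EstParamCov` and the reported standard errors can be sketched as follows, assuming Econometrics Toolbox is installed. The simulated data and the variable name `se` are illustrative.

```matlab
% Simulate illustrative data.
rng(1);                                   % seed chosen arbitrarily
Mdl0 = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(Mdl0,500);

Mdl = arima(2,0,1);
Mdl.Constant = 0;                         % held fixed as an equality constraint

[EstMdl,EstParamCov] = estimate(Mdl,y,'Display','off');

% Standard errors are the square roots of the main diagonal.
% Parameter order here: Constant, AR{1}, AR{2}, MA{1}, Variance.
se = sqrt(diag(EstParamCov));
```

Because the constant is constrained, its row and column in `EstParamCov` contain zeros, so the first entry of `se` is zero.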

`logL` — Optimized loglikelihood objective function value

scalar

Optimized loglikelihood objective function value, returned as a scalar.

**Data Types:** `double`

`info` — Summary information

structure array

Summary information, returned as a structure.

Field | Description |
---|---|
`exitflag` | Optimization exit flag (see `fmincon` in Optimization Toolbox) |
`options` | Optimization options controller (see `optimoptions` and `fmincon` in Optimization Toolbox) |
`X` | Vector of final parameter estimates |
`X0` | Vector of initial parameter estimates |

For example, you can display the vector of final estimates by typing `info.X` in the Command Window.

**Data Types:** `struct`
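A short sketch of inspecting `info` after a fit, assuming Econometrics Toolbox is installed. The simulated data, the warning message, and the variable names `finalEst` and `initEst` are illustrative.

```matlab
% Simulate illustrative data.
rng(1);                                   % seed chosen arbitrarily
Mdl0 = arima('AR',{0.5,-0.3},'MA',0.2,'Constant',0,'Variance',0.1);
y = simulate(Mdl0,500);

Mdl = arima(2,0,1);
[~,~,logL,info] = estimate(Mdl,y,'Display','off');

% fmincon reports a positive exit flag when it stops at a solution;
% zero or negative flags indicate that the optimizer stopped early or failed.
if info.exitflag <= 0
    warning('Optimizer did not report convergence (exitflag = %d).',info.exitflag);
end

finalEst = info.X;                        % final parameter estimates
initEst  = info.X0;                       % initial parameter estimates
```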

Fit an ARMA(2,1) model to simulated data.

Simulate 500 data points from the ARMA(2,1) model

$${y}_{t}=0.5{y}_{t-1}-0.3{y}_{t-2}+{\epsilon}_{t}+0.2{\epsilon}_{t-1},$$

where $${\epsilon}_{t}$$ follows a Gaussian distribution with mean 0 and variance 0.1.

```
Mdl0 = arima('AR',{0.5,-0.3},'MA',0.2,...
    'Constant',0,'Variance',0.1);
rng(5); % For reproducibility
y = simulate(Mdl0,500);
```

The simulated data is stored in the column vector `y`.

Specify an ARMA(2,1) model with no constant and unknown coefficients and variance.

```
Mdl = arima(2,0,1);
Mdl.Constant = 0
```

```
Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 1
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

Fit the ARMA(2,1) model to `y`.

```
EstMdl = estimate(Mdl,y);
```

```
    ARIMA(2,0,1) Model (Gaussian Distribution):

                 Value      StandardError    TStatistic      PValue  
                ________    _____________    __________    __________

    Constant           0            0             NaN             NaN
    AR{1}        0.49404      0.10321          4.7866      1.6961e-06
    AR{2}       -0.25348      0.06993         -3.6248      0.00028921
    MA{1}        0.27958      0.10721          2.6078       0.0091132
    Variance     0.10009    0.0066403          15.073      2.4228e-51
```

The result is a new `arima` model called `EstMdl`. The estimates in `EstMdl` resemble the parameter values that generated the simulated data.

Fit an integrated ARIMA(1,1,1) model to the daily close of the NASDAQ Composite Index.

Load the NASDAQ data included with the toolbox. Extract the first 1500 observations of the Composite Index (January 1990 to December 1995).

```
load Data_EquityIdx
nasdaq = DataTable.NASDAQ(1:1500);
```

Specify an ARIMA(1,1,1) model for fitting. The model is nonseasonal, so you can use shorthand syntax.

```
Mdl = arima(1,1,1);
```

Fit the model to the first half of the data.

```
EstMdl = estimate(Mdl,nasdaq(1:750));
```

```
    ARIMA(1,1,1) Model (Gaussian Distribution):

                 Value     StandardError    TStatistic      PValue   
                _______    _____________    __________    ___________

    Constant     0.2234       0.18418           1.213         0.22515
    AR{1}       0.11434       0.11944         0.95733          0.3384
    MA{1}       0.12764       0.11925          1.0703         0.28448
    Variance     18.983       0.68999          27.512     1.2543e-166
```

The result is a new `arima` model (`EstMdl`). The estimated parameters, their standard errors, and $$t$$ statistics display in the Command Window.

Use the estimated parameters as initial values for fitting the second half of the data.

```
con0 = EstMdl.Constant;
ar0 = EstMdl.AR{1};
ma0 = EstMdl.MA{1};
var0 = EstMdl.Variance;
[EstMdl2,EstParamCov2,logL2,info2] = estimate(Mdl,...
    nasdaq(751:end),'Constant0',con0,'AR0',ar0,...
    'MA0',ma0,'Variance0',var0);
```

```
    ARIMA(1,1,1) Model (Gaussian Distribution):

                 Value      StandardError    TStatistic      PValue   
                ________    _____________    __________    ___________

    Constant     0.61142       0.32675         1.8712        0.061315
    AR{1}       -0.15071       0.11782        -1.2792         0.20084
    MA{1}        0.38568       0.10905         3.5366       0.0004053
    Variance      36.493         1.227         29.742     2.1906e-194
```

The parameter estimates are stored in the `info2` data structure. Display the final parameter estimates.

```
info2.X
```

```
ans = 4×1

    0.6114
   -0.1507
    0.3857
   36.4933
```

Fit an ARIMAX model to a simulated time series without specifying initial values for the response or the parameters.

Define the ARIMAX(2,1,1) model

$$(1-0.5L+0.3{L}^{2}){(1-L)}^{1}{y}_{t}=1.5{x}_{1,t}+2.6{x}_{2,t}-0.3{x}_{3,t}+{\epsilon}_{t}+0.2{\epsilon}_{t-1}$$

to eventually simulate a time series of length 500, where $${\epsilon}_{t}$$ follows a Gaussian distribution with mean 0 and variance 0.1.

```
Mdl0 = arima('AR',{0.5,-0.3},'MA',0.2,'D',1,...
    'Constant',0,'Variance',0.1,'Beta',[1.5 2.6 -0.3]);
T = 500;
```

Simulate three stationary AR(1) series and presample values:

$$\begin{array}{c}{x}_{1,t}=0.1{x}_{1,t-1}+{\eta}_{1,t}\\ {x}_{2,t}=0.2{x}_{2,t-1}+{\eta}_{2,t}\\ {x}_{3,t}=0.3{x}_{3,t-1}+{\eta}_{3,t},\end{array}$$

where $${\eta}_{i,t}$$ follows a Gaussian distribution with mean 0 and variance 0.01 for *i* = 1, 2, 3.

```
numObs = Mdl0.P + T;
MdlX1 = arima('AR',0.1,'Constant',0,'Variance',0.01);
MdlX2 = arima('AR',0.2,'Constant',0,'Variance',0.01);
MdlX3 = arima('AR',0.3,'Constant',0,'Variance',0.01);
X1 = simulate(MdlX1,numObs);
X2 = simulate(MdlX2,numObs);
X3 = simulate(MdlX3,numObs);
Xmat = [X1 X2 X3];
```

The simulated exogenous predictors are stored in the `numObs`-by-3 matrix `Xmat`.

Simulate 500 data points from the ARIMAX(2,1,1) model.

```
y = simulate(Mdl0,T,'X',Xmat);
```

The simulated response is stored in the column vector `y`.

Create an ARIMA(2,1,1) model with known `0`-valued constant and unknown coefficients and variance.

```
Mdl = arima(2,1,1);
Mdl.Constant = 0
```

```
Mdl = 
  arima with properties:

     Description: "ARIMA(2,1,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 3
               D: 1
               Q: 1
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

`Mdl` is an ARIMA(2,1,1) model. `estimate` changes this designation to ARIMAX(2,1,1) when you specify the exogenous predictor data by using the `'X'` name-value pair argument. `estimate` fits all estimable parameters (properties having value `NaN` in `Mdl`) to the data.

Fit the ARIMAX(2,1,1) model to `y`, including the regression matrix `Xmat`.

```
EstMdl = estimate(Mdl,y,'X',Xmat);
```

```
    ARIMAX(2,1,1) Model (Gaussian Distribution):

                 Value       StandardError    TStatistic      PValue  
                _________    _____________    __________    __________

    Constant            0            0             NaN             NaN
    AR{1}         0.41634     0.046067          9.0376       1.601e-19
    AR{2}        -0.27405     0.040645         -6.7427      1.5552e-11
    MA{1}          0.3346     0.057208          5.8488      4.9499e-09
    Beta(1)        1.4194      0.14242          9.9662      2.1429e-23
    Beta(2)         2.542       0.1331          19.098      2.6194e-81
    Beta(3)      -0.28767      0.14035         -2.0496        0.040399
    Variance     0.096777     0.005791          16.712        1.08e-62
```

`EstMdl` is a new `arima` model designated as ARIMAX(2,1,1) because exogenous predictors enter the model. The estimates in `EstMdl` resemble the parameter values that generated the simulated data.


- Estimate Multiplicative ARIMA Model
- Estimate Conditional Mean and Variance Model
- Model Seasonal Lag Effects Using Indicator Variables
- Maximum Likelihood Estimation for Conditional Mean Models
- Conditional Mean Model Estimation with Equality Constraints
- Presample Data for Conditional Mean Model Estimation
- Initial Values for Conditional Mean Model Estimation
- Optimization Settings for Conditional Mean Model Estimation
