lratiotest

Likelihood ratio test of model specification

Syntax

  • h = lratiotest(uLogL,rLogL,dof)
  • h = lratiotest(uLogL,rLogL,dof,alpha)
  • [h,pValue] = lratiotest(___)
  • [h,pValue,stat,cValue] = lratiotest(___)

Description

h = lratiotest(uLogL,rLogL,dof) returns a logical value (h) with the rejection decision from conducting a likelihood ratio test of model specification.

lratiotest constructs the test statistic using the loglikelihood objective function evaluated at the unrestricted model parameter estimates (uLogL) and the restricted model parameter estimates (rLogL). The test statistic distribution has dof degrees of freedom.

  • If uLogL or rLogL is a vector, then the other must be a scalar or vector of equal length. lratiotest(uLogL,rLogL,dof) treats each element of a vector input as a separate test, and returns a vector of rejection decisions.

  • If uLogL or rLogL is a row vector, then lratiotest(uLogL,rLogL,dof) returns a row vector.
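As a minimal sketch of the vector behavior described above, the following call tests one unrestricted model against three restricted models. The loglikelihood values and degrees of freedom are hypothetical, chosen only so that the arithmetic is easy to follow.

```matlab
% Hypothetical loglikelihood maxima, for illustration only
uLogL = -100;              % one unrestricted model (scalar)
rLogL = [-101 -105 -110];  % three restricted models (row vector)
dof   = [1 2 3];           % number of restrictions in each test

% Test statistics are 2*(uLogL - rLogL) = [2 10 20]; the 5% chi-square
% critical values for 1, 2, and 3 degrees of freedom are approximately
% 3.84, 5.99, and 7.81, so only the first test fails to reject.
h = lratiotest(uLogL,rLogL,dof)   % row vector of decisions: [0 1 1]
```

Because rLogL is a row vector, h is returned as a row vector.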

h = lratiotest(uLogL,rLogL,dof,alpha) returns the rejection decision of the likelihood ratio test conducted at significance level alpha.

[h,pValue] = lratiotest(___) returns the rejection decision and p-value (pValue) for the hypothesis test, using any of the input arguments in the previous syntaxes.

[h,pValue,stat,cValue] = lratiotest(___) additionally returns the test statistic (stat) and critical value (cValue) for the hypothesis test.

Examples

Assess Model Specifications Using the Likelihood Ratio Test

Compare two model specifications for simulated education and income data. The unrestricted model has the following loglikelihood:

$$l(\rho,\beta) = -n\log\Gamma(\rho) + \rho\sum^n_{k=1}\log\beta_k + (\rho-1)\sum^n_{k=1}\log(y_k)-\sum^n_{k=1}y_k\beta_k,$$

where

  • $\beta_k=\frac{1}{\beta+x_k}.$

  • $x_k$ is the number of grades that person k completed.

  • $y_k$ is the income (in thousands of USD) of person k.

That is, the income of person k given the number of grades that person k completed is Gamma distributed with shape $\rho$ and rate $\beta_k$. The restricted model sets $\rho = 1$, which implies that the income of person k given the number of grades person k completed is exponentially distributed with mean $\beta + x_k$.
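To see why the restriction yields an exponential model, substitute $\rho = 1$ into the loglikelihood. This short check uses only $\log\Gamma(1) = 0$ and the fact that the $(\rho - 1)$ term vanishes:

$$l(\rho,\beta)\Big|_{\rho=1} = -n\log\Gamma(1) + \sum^n_{k=1}\log\beta_k + 0\cdot\sum^n_{k=1}\log(y_k) - \sum^n_{k=1}y_k\beta_k = \sum^n_{k=1}\left(\log\beta_k - y_k\beta_k\right).$$

This is the loglikelihood of independent exponential observations with rate $\beta_k$, that is, with mean $1/\beta_k = \beta + x_k$.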

The null hypothesis is $H_0: \rho = 1$, which corresponds to the restricted model. Comparing this model to the unrestricted model using lratiotest requires the following:

  • The loglikelihood function

  • The maximum likelihood estimate (MLE) under the unrestricted model

  • The MLE under the restricted model

Load the data.

load Data_Income1
x = DataTable.EDU;
y = DataTable.INC;

To estimate the unrestricted model parameters, maximize $l(\rho,\beta)$ with respect to $\rho$ and $\beta$. The gradient of $l(\rho,\beta)$ is

$$\frac{\partial l(\rho,\beta)}{\partial \rho} = -n\psi(\rho) + \sum^n_{k=1}\log(y_k\beta_k)$$

$$\frac{\partial l(\rho,\beta)}{\partial \beta} = \sum^n_{k=1}\beta_k(\beta_k y_k-\rho),$$

where $\psi(\rho)$ is the digamma function.
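The second component follows from the chain rule. As a short check, $\beta_k = 1/(\beta + x_k)$ gives $\partial\beta_k/\partial\beta = -\beta_k^2$, so that

$$\frac{\partial l(\rho,\beta)}{\partial \beta} = \rho\sum^n_{k=1}\frac{1}{\beta_k}\left(-\beta_k^2\right) - \sum^n_{k=1}y_k\left(-\beta_k^2\right) = \sum^n_{k=1}\beta_k(\beta_k y_k-\rho).$$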

nLogLGradFun = @(theta) deal(-sum(-gammaln(theta(1)) - ...
    theta(1)*log(theta(2) + x) + (theta(1)-1)*log(y) - ...
    y./(theta(2)+x)),...
    -[sum(-psi(theta(1))+log(y./(theta(2)+x)));...
    sum(1./(theta(2)+x).*(y./(theta(2)+x)-theta(1)))]);

nLogLGradFun is an anonymous function that returns the negative loglikelihood and its gradient given the input theta, which holds the parameters $\rho$ and $\beta$, respectively.
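One way to gain confidence in a hand-coded gradient is to compare it with a central finite difference. The sketch below does this for the same anonymous function. The data vectors, the test point theta, and the step size are hypothetical values chosen for illustration; they are not taken from Data_Income1.

```matlab
% Hypothetical data, for illustration only (not Data_Income1)
x = [8; 12; 16; 20];    % grades completed
y = [15; 30; 42; 80];   % income, in thousands of USD

% Same negative loglikelihood and gradient as in the example
nLogLGradFun = @(theta) deal(-sum(-gammaln(theta(1)) - ...
    theta(1)*log(theta(2) + x) + (theta(1)-1)*log(y) - ...
    y./(theta(2)+x)),...
    -[sum(-psi(theta(1))+log(y./(theta(2)+x)));...
    sum(1./(theta(2)+x).*(y./(theta(2)+x)-theta(1)))]);

theta = [2; 5];   % arbitrary feasible test point [rho; beta]

% deal requires that both outputs be requested
[~,gAnalytic] = nLogLGradFun(theta);

step = 1e-6;      % finite-difference step (assumed; adjust as needed)
gNumeric = zeros(2,1);
for j = 1:2
    e = zeros(2,1);
    e(j) = step;
    [fPlus,~]  = nLogLGradFun(theta + e);
    [fMinus,~] = nLogLGradFun(theta - e);
    gNumeric(j) = (fPlus - fMinus)/(2*step);   % central difference
end

% A small value suggests that the analytic gradient matches the objective
maxAbsDiff = max(abs(gAnalytic - gNumeric))
```

Because the function is built with deal, calling it with a single output argument results in an error, which is why the sketch always requests two outputs.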

Numerically optimize the negative loglikelihood function using fmincon, which minimizes an objective function subject to constraints.

theta0 = randn(2,1); % Initial value for optimization
uLB = [0 -min(x)];   % Unrestricted model lower bound
uUB = [Inf Inf];     % Unrestricted model upper bound
% Optimization options
options = optimoptions('fmincon','Algorithm','interior-point',...
    'TolFun',1e-10,'Display','off','GradObj','on');

[uMLE,uLogL] = fmincon(nLogLGradFun,theta0,[],[],[],[],uLB,uUB,[],options);
uLogL = -uLogL;

uMLE is the unrestricted maximum likelihood estimate, and uLogL is the loglikelihood maximum.

Impose the restriction on the loglikelihood by setting both the lower and upper bound constraints of $\rho$ to 1. Minimize the negative, restricted loglikelihood.

dof = 1;           % Number of restrictions
rLB = [1 -min(x)]; % Restricted model lower bound
rUB = [1 Inf];     % Restricted model upper bound
[rMLE,rLogL] = fmincon(nLogLGradFun,theta0,[],[],[],[],rLB,rUB,[],options);
rLogL = -rLogL;

rMLE is the restricted maximum likelihood estimate, and rLogL is the restricted loglikelihood maximum.

Use the likelihood ratio test to assess whether the data provide enough evidence to favor the unrestricted model over the restricted model.

[h,pValue,stat] = lratiotest(uLogL,rLogL,dof)
h =

     1


pValue =

   8.9146e-04


stat =

   11.0404

pValue is close to 0, which indicates that there is strong evidence suggesting that the unrestricted model fits the data better than the restricted model.

Test Among Multiple Nested Model Specifications

Assess model specifications by testing down among multiple restricted models using simulated data. The true model is the ARMA(2,1)

$$y_t = 3 + 0.9y_{t-1}-0.5y_{t-2}+\varepsilon_t+0.7\varepsilon_{t-1},$$

where $\varepsilon_t$ is Gaussian with mean 0 and variance 1.

Specify the true ARMA(2,1) model, and simulate 100 response values.

TrueMdl = arima('AR',{0.9,-0.5},'MA',0.7,...
    'Constant',3,'Variance',1);
T = 100;
rng(1); % For reproducibility
y = simulate(TrueMdl,T);

Specify the unrestricted model and the candidate models for testing down.

Mdl = {arima(2,0,2),arima(2,0,1),arima(2,0,0),arima(1,0,2),arima(1,0,1),...
    arima(1,0,0),arima(0,0,2),arima(0,0,1)};
rMdlNames = {'ARMA(2,1)','AR(2)','ARMA(1,2)','ARMA(1,1)',...
    'AR(1)','MA(2)','MA(1)'};

Mdl is a 1-by-8 cell array. Mdl{1} is the unrestricted model, and each of the other seven cells contains a candidate model. rMdlNames holds the names of the seven candidate models.

Fit the candidate models to the simulated data.

logL = zeros(numel(Mdl),1); % Preallocate loglikelihoods
dof = logL;                 % Preallocate degrees of freedom
for k = 1:size(Mdl,2)
    [EstMdl,~,logL(k)] = estimate(Mdl{k},y,'Display','off');
    dof(k) = 4 - (EstMdl.P + EstMdl.Q); % Number of restricted parameters
end
uLogL = logL(1);
rLogL = logL(2:end);
dof = dof(2:end);

uLogL and rLogL are the values of the unrestricted loglikelihood evaluated at the unrestricted and restricted model parameter estimates, respectively.

Apply the likelihood ratio test at a 1% significance level to find the appropriate, restricted model specification(s).

alpha = .01;
h = lratiotest(uLogL,rLogL,dof,alpha);
RestrictedModels = rMdlNames(~h)
RestrictedModels = 

    'ARMA(2,1)'    'ARMA(1,2)'    'ARMA(1,1)'    'MA(2)'

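At the 1% significance level, the test fails to reject the ARMA(2,1), ARMA(1,2), ARMA(1,1), and MA(2) specifications, so these are the most appropriate restricted models.

You can test down again, but use ARMA(2,1) as the unrestricted model. In this case, you must remove any candidate that is not nested in ARMA(2,1), such as MA(2), from the possible restricted models.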

Assess Conditional Heteroscedasticity Using the Likelihood Ratio Test

Test whether there are significant ARCH effects in a simulated response series using lratiotest. The parameter values in this example are arbitrary.

Specify the AR(1) model with an ARCH(1) variance:

$$y_t = 0.9y_{t-1} + \varepsilon_t,$$

where

  • $\varepsilon_t = w_t\sqrt{h_t}.$

  • $h_t = 1 + 0.5\varepsilon_{t-1}^2.$

  • $w_t$ is Gaussian with mean 0 and variance 1.

VarMdl = garch('ARCH',0.5,'Constant',1);
Mdl = arima('Constant',0,'Variance',VarMdl,'AR',0.9);

Mdl is a fully specified AR(1) model with an ARCH(1) variance.

Simulate presample and effective sample responses from Mdl.

T = 100;
rng(1);  % For reproducibility
n = 2;   % Number of presample observations required for the gradient
[y,epsilon,condVariance] = simulate(Mdl,T + n);

psI = 1:n;             % Presample indices
esI = (n + 1):(T + n); % Estimation sample indices

epsilon is the random path of innovations from VarMdl. The software filters epsilon through Mdl to yield the random response path y.

Specify the unrestricted model assuming that the conditional mean model constant is 0:

$$y_t = \phi_1 y_{t-1} + \varepsilon_t,$$

where $h_t = \alpha_0 + \alpha_1\varepsilon_{t-1}^2$. Fit the unrestricted model to the simulated data (y) using the presample observations.

UVarMdl = garch(0,1);
UMdl = arima('ARLags',1,'Constant',0,'Variance',UVarMdl);
[~,~,uLogL] = estimate(UMdl,y(esI),'Y0',y(psI),'E0',epsilon(psI),...
    'V0',condVariance(psI),'Display','off');

uLogL is the maximum value of the unrestricted loglikelihood function.

Specify the restricted model assuming that the conditional mean model constant is 0:

$$y_t = \phi_1 y_{t-1} + \varepsilon_t,$$

where $h_t = \alpha_0$. Fit the restricted model to the simulated data (y) using the presample observations.

RVarMdl = garch(0,1);
RVarMdl.ARCH{1} = 0;
RMdl = arima('ARLags',1,'Constant',0,'Variance',RVarMdl);
[~,~,rLogL] = estimate(RMdl,y(esI),'Y0',y(psI),'E0',epsilon(psI),...
    'V0',condVariance(psI),'Display','off');

The structure of RMdl is the same as that of UMdl. However, every parameter is unknown except for the restricted ARCH coefficient, which is fixed at 0. The software treats fixed parameter values as equality constraints during estimation. You can interpret RMdl as an AR(1) model with Gaussian innovations that have mean 0 and constant variance.

Test the null hypothesis that $\alpha_1 = 0$ at the default 5% significance level using lratiotest.

dof = (UMdl.P + UMdl.Q + UVarMdl.P + UVarMdl.Q) ...
    - (RMdl.P + RMdl.Q + RVarMdl.P + RVarMdl.Q);
[h,pValue,stat,cValue] = lratiotest(uLogL,rLogL,dof)
h =

     1


pValue =

   6.7505e-04


stat =

   11.5567


cValue =

    3.8415

h = 1 indicates that the null, restricted model should be rejected in favor of the alternative, unrestricted model. pValue is close to 0, suggesting that there is strong evidence for the rejection. stat is the value of the chi-square test statistic, and cValue is the critical value for the test.

Input Arguments

uLogL — Unrestricted model loglikelihood maxima
scalar | vector

Unrestricted model loglikelihood maxima, specified as a scalar or vector. If uLogL is a scalar, then the software expands it to the same length as rLogL.

Data Types: double

rLogL — Restricted model loglikelihood maxima
scalar | vector

Restricted model loglikelihood maxima, specified as a scalar or vector. If rLogL is a scalar, then the software expands it to the same length as uLogL. Elements of rLogL should not exceed the corresponding elements of uLogL.

Data Types: double

dof — Degrees of freedom
positive integer | vector of positive integers

Degrees of freedom for the asymptotic, chi-square distribution of the test statistics, specified as a positive integer or vector of positive integers.

For each corresponding test, the elements of dof:

  • Are the number of model restrictions.

  • Should be less than the number of parameters in the unrestricted model.

When conducting k > 1 tests,

  • If dof is a scalar, then the software expands it to a k-by-1 vector.

  • If dof is a vector, then it must have length k.

Data Types: double

alpha — Nominal significance levels
0.05 (default) | scalar | vector

Nominal significance levels for the hypothesis tests, specified as a scalar or vector.

Each element of alpha must be greater than 0 and less than 1.

When conducting k > 1 tests,

  • If alpha is a scalar, then the software expands it to a k-by-1 vector.

  • If alpha is a vector, then it must have length k.
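As a small illustration of a vector-valued alpha, the sketch below runs the same comparison at three significance levels. The loglikelihood values are hypothetical, and rLogL is repeated so that the number of tests, k = 3, is set by the loglikelihood inputs.

```matlab
% Hypothetical loglikelihood maxima, for illustration only
uLogL = -100;                 % unrestricted model
rLogL = [-103 -103 -103];     % same restricted model, repeated for k = 3 tests
dof   = 1;                    % scalar, expanded to all three tests
alpha = [0.10 0.05 0.01];     % one significance level per test

% The test statistic is 2*(uLogL - rLogL) = 6 in every test. The critical
% values for 1 degree of freedom are approximately 2.71, 3.84, and 6.63,
% so the test rejects at the 10% and 5% levels but not at the 1% level.
[h,pValue,stat,cValue] = lratiotest(uLogL,rLogL,dof,alpha)
```

Only cValue and h change across the three tests; stat and pValue do not depend on alpha.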

Data Types: double

Output Arguments

h — Test rejection decisions
logical | vector of logicals

Test rejection decisions, returned as a logical value or vector of logical values with a length equal to the number of tests that the software conducts.

  • h = 1 indicates rejection of the null, restricted model in favor of the alternative, unrestricted model.

  • h = 0 indicates failure to reject the null, restricted model.

pValue — Test statistic p-values
scalar | vector

Test statistic p-values, returned as a scalar or vector with a length equal to the number of tests that the software conducts.

stat — Test statistics
scalar | vector

Test statistics, returned as a scalar or vector with a length equal to the number of tests that the software conducts.

cValue — Critical values
scalar | vector

Critical values determined by alpha, returned as a scalar or vector with a length equal to the number of tests that the software conducts.

More About

Likelihood Ratio Test

The likelihood ratio test compares specifications of nested models by assessing the significance of restrictions to an extended model with unrestricted parameters.

The test uses the following algorithm:

  1. Maximize the loglikelihood function $l(\theta)$ under the restricted and unrestricted model assumptions. Denote the MLEs for the restricted and unrestricted models $\hat{\theta}_0$ and $\hat{\theta}$, respectively.

  2. Evaluate the loglikelihood objective function at the restricted and unrestricted MLEs, that is, $\hat{l}_0 = l(\hat{\theta}_0)$ and $\hat{l} = l(\hat{\theta})$.

  3. Compute the likelihood ratio test statistic, $LR = 2(\hat{l} - \hat{l}_0)$.

  4. If $LR$ exceeds a critical value ($C_\alpha$) relative to its asymptotic distribution, then reject the null, restricted model in favor of the alternative, unrestricted model.

    • Under the null hypothesis, $LR$ is $\chi^2_d$ distributed with $d$ degrees of freedom.

    • The degrees of freedom for the test ($d$) is the number of restricted parameters.

    • The significance level of the test ($\alpha$) determines the critical value ($C_\alpha$).
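The last two steps can be reproduced by hand, which may help when interpreting the outputs of lratiotest. The sketch below assumes that the loglikelihood maxima are already available; the numeric values are hypothetical. It uses chi2inv and chi2cdf from Statistics and Machine Learning Toolbox, and it assumes that the rejection rule is the one stated in step 4.

```matlab
% Hypothetical loglikelihood maxima, for illustration only
uLogL = -250.3;   % unrestricted model
rLogL = -254.1;   % restricted model
dof   = 2;        % number of restricted parameters
alpha = 0.05;     % significance level

stat   = 2*(uLogL - rLogL);           % likelihood ratio statistic
cValue = chi2inv(1 - alpha,dof);      % critical value of the chi-square distribution
pValue = chi2cdf(stat,dof,'upper');   % upper-tail probability of stat
h      = stat > cValue;               % reject when the statistic exceeds the critical value

% Compare the manual results with lratiotest
[h2,pValue2,stat2,cValue2] = lratiotest(uLogL,rLogL,dof,alpha);
```

Here stat is approximately 7.6, which exceeds the 5% critical value for 2 degrees of freedom (about 5.99), so the restricted model is rejected.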

Tips

  • Estimate unrestricted and restricted univariate linear time series models, such as arima or garch, or time series regression models (regARIMA) using estimate. Estimate unrestricted and restricted multivariate linear time series models using vgxvarx.

    estimate and vgxvarx return loglikelihood maxima, which you can use as inputs to lratiotest.

  • If you can easily compute both restricted and unrestricted parameter estimates, then use lratiotest. By comparison:

    • waldtest only requires unrestricted parameter estimates.

    • lmtest requires restricted parameter estimates.

Algorithms

  • lratiotest performs multiple, independent tests when the unrestricted or restricted model loglikelihood maxima (uLogL and rLogL, respectively) is a vector.

    • If rLogL is a vector and uLogL is a scalar, then lratiotest "tests down" against multiple restricted models.

    • If uLogL is a vector and rLogL is a scalar, then lratiotest "tests up" against multiple unrestricted models.

    • Otherwise, lratiotest compares model specifications pair-wise.

  • alpha is nominal in that it specifies a rejection probability in the asymptotic distribution. The actual rejection probability is generally greater than the nominal significance.

References

[1] Davidson, R. and J. G. MacKinnon. Econometric Theory and Methods. Oxford, UK: Oxford University Press, 2004.

[2] Godfrey, L. G. Misspecification Tests in Econometrics. Cambridge, UK: Cambridge University Press, 1997.

[3] Greene, W. H. Econometric Analysis. 6th ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2008.

[4] Hamilton, J. D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.
