results = compare(lme,altlme) returns the results of a likelihood ratio test that compares the linear mixed-effects models lme and altlme. Both models must use the same response vector in the fit and lme must be nested in altlme for a valid theoretical likelihood ratio test. Always input the smaller model first, and the larger model second.
compare tests the following null and alternate hypotheses:
H_{0}: Observed response vector is generated by lme.
H_{1}: Observed response vector is generated by altlme.
It is recommended that you fit lme and altlme using the maximum likelihood (ML) method prior to model comparison. If you use the restricted maximum likelihood (REML) method, then both models must have the same fixed-effects design matrix.
To test for fixed effects, either use compare with the simulated likelihood ratio test (with lme and altlme fit using ML), or use the fixedEffects, anova, or coefTest methods.
results = compare(___,Name,Value) also returns the results of a likelihood ratio test that compares linear mixed-effects models lme and altlme with additional options specified by one or more Name,Value pair arguments.
For example, you can check if the first input model is nested in the second input model.
[results,siminfo] = compare(lme,altlme,'NSim',nsim) returns the results of a simulated likelihood ratio test that compares linear mixed-effects models lme and altlme.
You can fit lme and altlme using ML or REML. Also, lme does not have to be nested in altlme. If you use the restricted maximum likelihood (REML) method to fit the models, then both models must have the same fixed-effects design matrix.
[results,siminfo] = compare(___,Name,Value) also returns the results of a simulated likelihood ratio test that compares linear mixed-effects models lme and altlme with additional options specified by one or more Name,Value pair arguments.
For example, you can change the options for performing the simulated likelihood ratio test, or change the confidence level of the confidence interval for the p-value.
Under the null hypothesis H_{0}, the observed likelihood ratio test statistic has an approximate chi-squared reference distribution with degrees of freedom deltaDF. When comparing two models, compare computes the p-value for the likelihood ratio test by comparing the observed likelihood ratio test statistic with this chi-squared reference distribution.
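The p-value computation from the chi-squared reference distribution can be sketched as follows (in Python for concreteness; the observed statistic and deltaDF below are hypothetical values, not output from an actual fit):

```python
from scipy.stats import chi2

# Hypothetical values: observed likelihood ratio test statistic, and
# the difference in the number of parameters between the two models.
LRStat = 7.38
deltaDF = 2

# p-value from the chi-squared reference distribution.
# chi2.sf is the survival function, i.e. 1 - chi2.cdf.
p = chi2.sf(LRStat, deltaDF)
```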
The p-values obtained using the likelihood ratio test can be conservative when testing for the presence or absence of random-effects terms and anticonservative when testing for the presence or absence of fixed-effects terms. Hence, when testing for fixed effects, use the fixedEffects, anova, or coefTest method, or the simulated likelihood ratio test.
To perform the simulated likelihood ratio test, compare first generates the reference distribution of the likelihood ratio test statistic under the null hypothesis. Then, it assesses the statistical significance of the alternate model by comparing the observed likelihood ratio test statistic to this reference distribution.
compare produces the simulated reference distribution of the likelihood ratio test statistic under the null hypothesis as follows:
1. Generate random data ysim from the fitted model lme.
2. Fit the model specified in lme and the alternate model altlme to the simulated data ysim.
3. Calculate the likelihood ratio test statistic using the results from step 2, and store the value.
4. Repeat steps 1 through 3 nsim times.
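The steps above amount to a parametric bootstrap. A minimal sketch of the procedure, using a toy pair of nested Gaussian models (intercept-only as the null, intercept plus slope as the alternative) to stand in for lme and altlme:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data. The "null" model is intercept-only; the "alternative" adds
# a slope. Real usage would fit linear mixed-effects models instead.
n = 50
x = np.linspace(0, 1, n)
y = 1.0 + 0.2 * x + rng.normal(0, 0.5, n)

def lr_stat(yobs):
    """Likelihood ratio statistic for intercept-only vs. slope model
    under Gaussian maximum likelihood: n * log(RSS0 / RSS1)."""
    rss0 = np.sum((yobs - yobs.mean()) ** 2)           # null fit
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, yobs, rcond=None)    # alternative fit
    rss1 = np.sum((yobs - X @ beta) ** 2)
    return n * np.log(rss0 / rss1)

T = lr_stat(y)  # observed statistic

# Steps 1-3, repeated nsim times: simulate from the fitted null model,
# refit both models, and store the likelihood ratio statistic.
nsim = 500
mu_hat = y.mean()
sigma_hat = np.sqrt(np.sum((y - mu_hat) ** 2) / n)
T_H0 = np.array([lr_stat(mu_hat + sigma_hat * rng.normal(0, 1, n))
                 for _ in range(nsim)])
```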
Then, compare computes the p-value for the simulated likelihood ratio test by comparing the observed likelihood ratio test statistic with the simulated reference distribution. The p-value estimate is the ratio of the number of times the simulated likelihood ratio test statistic is equal to or exceeds the observed value plus one, to the number of replications plus one.
Suppose the observed likelihood ratio statistic is T, and the simulated reference distribution is stored in vector T_{H0}. Then,
$$p\text{-value}=\frac{\left[{\displaystyle \sum _{j=1}^{nsim}I\left({T}_{{H}_{0}}\left(j\right)\ge T\right)}\right]+1}{nsim+1}.$$
To account for the uncertainty in the simulated reference distribution, compare computes a 100*(1 – α)% confidence interval for the true p-value.
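The p-value estimate and a confidence interval for it can be sketched as below. The simulated statistics here are stand-in values, and the Clopper-Pearson-style interval is an assumption for illustration; the documentation does not specify compare's exact interval construction.

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(1)

# Stand-ins for the simulated reference distribution and the observed
# statistic (illustrative only).
nsim = 1000
T_H0 = rng.chisquare(df=1, size=nsim)
T = 5.0

# p-value estimate from the formula above.
k = int(np.sum(T_H0 >= T))
p = (k + 1) / (nsim + 1)

# Clopper-Pearson-style 95% interval for the true p-value (assumed
# construction, for illustration).
alpha = 0.05
lower = beta.ppf(alpha / 2, k, nsim - k + 1) if k > 0 else 0.0
upper = beta.ppf(1 - alpha / 2, k + 1, nsim - k) if k < nsim else 1.0
```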
You can use the simulated likelihood ratio test to compare arbitrary linear mixed-effects models. That is, when you are using the simulated likelihood ratio test, lme does not have to be nested within altlme, and you can fit lme and altlme using either maximum likelihood (ML) or restricted maximum likelihood (REML) methods. If you use the restricted maximum likelihood (REML) method to fit the models, then both models must have the same fixed-effects design matrix.
The 'CheckNesting',true name-value pair argument checks the following requirements.
For a simulated likelihood ratio test:
You must use the same method to fit both models (lme and altlme). compare cannot compare a model fit using ML to a model fit using REML.
You must fit both models to the same response vector.
If you use REML to fit lme and altlme, then both models must have the same fixed-effects design matrix.
The maximized log likelihood or restricted log likelihood of the bigger model (altlme) must be greater than or equal to that of the smaller model (lme).
For a theoretical test, 'CheckNesting',true checks all the requirements listed for a simulated likelihood ratio test and the following:
Weight vectors you use to fit lme and altlme must be identical.
If you use ML to fit lme and altlme, the fixed-effects design matrix of the bigger model (altlme) must contain that of the smaller model (lme).
The random-effects design matrix of the bigger model (altlme) must contain that of the smaller model (lme).
Akaike information criterion (AIC) is AIC = –2*logL_{M} + 2*(nc + p + 1), where logL_{M} is the maximized log likelihood (or maximized restricted log likelihood) of the model, and nc + p + 1 is the number of parameters estimated in the model. p is the number of fixed-effects coefficients, nc is the total number of parameters in the random-effects covariance excluding the residual variance, and the additional 1 accounts for the residual variance.
Bayesian information criterion (BIC) is BIC = –2*logL_{M} + ln(n_{eff})*(nc + p + 1), where logL_{M} is the maximized log likelihood (or maximized restricted log likelihood) of the model, n_{eff} is the effective number of observations, and (nc + p + 1) is the number of parameters estimated in the model.
If the fitting method is maximum likelihood (ML), then n_{eff} = n, where n is the number of observations.
If the fitting method is restricted maximum likelihood (REML), then n_{eff} = n – p.
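These definitions can be sketched directly (the fit summary numbers below are illustrative, not output from an actual model):

```python
import numpy as np

# Hypothetical fit summary.
logL = -150.0   # maximized (restricted) log likelihood
p_fe = 3        # number of fixed-effects coefficients (p)
nc = 2          # random-effects covariance parameters, excl. residual
n = 100         # number of observations
npar = nc + p_fe + 1   # +1 for the residual variance

aic = -2 * logL + 2 * npar

# BIC uses the effective number of observations, which depends on the
# fitting method: n for ML, n - p for REML.
n_eff_ml = n
n_eff_reml = n - p_fe
bic_ml = -2 * logL + np.log(n_eff_ml) * npar
bic_reml = -2 * logL + np.log(n_eff_reml) * npar
```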
A lower value of deviance indicates a better fit. As the value of deviance decreases, both AIC and BIC tend to decrease. Both AIC and BIC also include penalty terms based on the total number of estimated parameters, nc + p + 1. So, as the number of parameters increases, the values of AIC and BIC tend to increase as well. When comparing different models, the model with the lowest AIC or BIC value is considered the best-fitting model.
LinearMixedModel computes the deviance of model M as minus two times the log likelihood of that model. Let L_{M} denote the maximum value of the likelihood function for model M. Then, the deviance of model M is
$$-2*\mathrm{log}{L}_{M}.$$
A lower value of deviance indicates a better fit. Suppose M_{1} and M_{2} are two different models, where M_{1} is nested in M_{2}. Then, the fit of the models can be assessed by comparing the deviances Dev_{1} and Dev_{2} of these models. The difference of the deviances is
$$Dev=De{v}_{1}-De{v}_{2}=2\left(\mathrm{log}{L}_{{M}_{2}}-\mathrm{log}{L}_{{M}_{1}}\right).$$
Usually, the asymptotic distribution of this difference is a chi-square distribution with degrees of freedom v equal to the number of parameters that are estimated in one model but fixed (typically at 0) in the other. That is, v is equal to the difference in the number of parameters estimated in M_{1} and M_{2}. You can get the p-value for this test using 1 – chi2cdf(Dev,v), where Dev = Dev_{1} – Dev_{2}.
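A worked sketch of this comparison (the log likelihoods and parameter counts below are hypothetical):

```python
from scipy.stats import chi2

# Hypothetical maximized log likelihoods for nested models M1 and M2
# (M1 nested in M2), and their estimated parameter counts.
logL1, logL2 = -162.1, -158.9
npar1, npar2 = 4, 6

dev1 = -2 * logL1      # deviance of the smaller model
dev2 = -2 * logL2      # deviance of the larger model
Dev = dev1 - dev2      # = 2 * (logL2 - logL1)
v = npar2 - npar1      # degrees of freedom

p = chi2.sf(Dev, v)    # same as 1 - chi2.cdf(Dev, v)
```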
However, in mixed-effects models, when some variance components fall on the boundary of the parameter space, the asymptotic distribution of this difference is more complicated. For example, consider the hypotheses
H_{0}: $$D=\left(\begin{array}{cc}{D}_{11}& 0\\ 0& 0\end{array}\right),$$ D is a q-by-q symmetric positive semidefinite matrix.
H_{1}: D is a (q+1)-by-(q+1) symmetric positive semidefinite matrix.
That is, H_{1} states that the last row and column of D are different from zero. Here, the bigger model M_{2} has q + 1 parameters and the smaller model M_{1} has q parameters. And Dev has a 50:50 mixture of χ^{2}_{q} and χ^{2}_{(q + 1)} distributions (Stram and Lee, 1994).
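Under this boundary condition, the mixture p-value can be computed as below (the observed deviance difference and q are illustrative values). Note that ignoring the boundary issue and using a plain chi-squared reference overstates the p-value, which is why the naive test is conservative for random effects:

```python
from scipy.stats import chi2

# Hypothetical: testing whether one extra random effect has zero
# variance, so the null distribution of Dev is a 50:50 mixture of
# chi-squared with q and q+1 degrees of freedom (Stram and Lee, 1994).
q = 1
Dev = 3.5   # observed deviance difference (illustrative)

p_mixture = 0.5 * chi2.sf(Dev, q) + 0.5 * chi2.sf(Dev, q + 1)
p_naive = chi2.sf(Dev, q + 1)   # ignoring the boundary constraint
```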
[1] Hox, J. Multilevel Analysis, Techniques and Applications. Lawrence Erlbaum Associates, Inc., 2002.
[2] Stram, D. O., and J. W. Lee. "Variance components testing in the longitudinal mixed-effects model." Biometrics, Vol. 50, No. 4, 1994, pp. 1171–1177.
anova | covarianceParameters | fitlme | fitlmematrix | fixedEffects | LinearMixedModel | randomEffects