The Akaike information criterion (AIC) is AIC = –2*logL_{M} + 2*(nc + p + 1), where logL_{M} is the maximized log likelihood (or maximized restricted log likelihood) of the model, and nc + p + 1 is the number of parameters estimated in the model. p is the number of fixed-effects coefficients, and nc is the total number of parameters in the random-effects covariance excluding the residual variance; the additional 1 accounts for the residual variance.
The Bayesian information criterion (BIC) is BIC = –2*logL_{M} + ln(n_{eff})*(nc + p + 1), where logL_{M} is the maximized log likelihood (or maximized restricted log likelihood) of the model, n_{eff} is the effective number of observations, and nc + p + 1 is the number of parameters estimated in the model.
If the fitting method is maximum likelihood (ML), then n_{eff} = n, where n is the number of observations.
If the fitting method is restricted maximum likelihood (REML), then n_{eff} = n – p.
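The two criteria can be sketched directly from the formulas above. The following is illustrative Python, not the MATLAB API; the function name and argument names are hypothetical:

```python
import math

def aic_bic(logL, n, p, nc, method="ML"):
    """Compute AIC and BIC for a linear mixed-effects model.

    logL   : maximized (restricted) log likelihood
    n      : number of observations
    p      : number of fixed-effects coefficients
    nc     : random-effects covariance parameters (excluding residual variance)
    method : "ML" or "REML", which determines the effective sample size
    """
    k = nc + p + 1                            # total estimated parameters
    n_eff = n if method == "ML" else n - p    # REML uses n_eff = n - p
    aic = -2.0 * logL + 2.0 * k
    bic = -2.0 * logL + math.log(n_eff) * k
    return aic, bic
```

Note that only the BIC depends on the fitting method, through n_{eff}; the AIC penalty is the same under ML and REML.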
A lower value of deviance indicates a better fit. As the deviance decreases, both AIC and BIC tend to decrease. Both AIC and BIC also include a penalty term based on the number of estimated parameters, nc + p + 1. So, as the number of parameters increases, the values of AIC and BIC tend to increase as well. When comparing different models, the model with the lowest AIC or BIC value is considered the best-fitting model.
LinearMixedModel computes the deviance of model M as minus two times the log likelihood of that model. Let L_{M} denote the maximum value of the likelihood function for model M. Then, the deviance of model M is
$$-2*\mathrm{log}{L}_{M}.$$
A lower value of deviance indicates a better fit. Suppose M_{1} and M_{2} are two different models, where M_{1} is nested in M_{2}. Then, the fit of the models can be assessed by comparing the deviances Dev_{1} and Dev_{2} of these models. The difference of the deviances is
$$Dev = Dev_{1} - Dev_{2} = 2\left(\log L_{M_{2}} - \log L_{M_{1}}\right).$$
Usually, the asymptotic distribution of this difference is a chi-square distribution with degrees of freedom v equal to the number of parameters that are estimated in one model but fixed (typically at 0) in the other. That is, v is equal to the difference in the number of parameters estimated in M_{1} and M_{2}. You can get the p-value for this test using 1 – chi2cdf(Dev,v), where Dev = Dev_{1} – Dev_{2}.
However, in mixed-effects models, when some variance components fall on the boundary of the parameter space, the asymptotic distribution of this difference is more complicated. For example, consider the hypotheses
H_{0}: $$D=\left(\begin{array}{cc}{D}_{11}& 0\\ 0& 0\end{array}\right),$$ where D_{11} is a q-by-q symmetric positive semidefinite matrix.
H_{1}: D is a (q+1)-by-(q+1) symmetric positive semidefinite matrix.
That is, H_{1} states that the last row and column of D are different from zero. Here, the bigger model M_{2} has q + 1 parameters and the smaller model M_{1} has q parameters. Dev then has a 50:50 mixture of χ^{2}_{q} and χ^{2}_{(q + 1)} distributions (Stram and Lee, 1994).
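The boundary-adjusted p-value under the 50:50 mixture can be sketched as below. This is illustrative Python, not the MATLAB API; for brevity the chi-square survival function is written in closed form for the df = 1 and df = 2 case (q = 1), which covers testing a single extra variance component:

```python
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square variable; closed forms for df = 1 and 2
    (enough to illustrate the q = 1 boundary test)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this illustration covers df = 1 and df = 2 only")

def boundary_lrt_pvalue(dev, q):
    """p-value for testing one extra variance component on the boundary,
    using the 50:50 mixture of chi2(q) and chi2(q+1) null distribution
    (Stram and Lee, 1994) instead of a plain chi2(q+1)."""
    return 0.5 * chi2_sf(dev, q) + 0.5 * chi2_sf(dev, q + 1)
```

Because the mixture puts half its mass on the smaller-df chi-square, the mixture p-value is smaller than the naive χ^{2}_{(q + 1)} p-value, so ignoring the boundary makes the test conservative.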
[1] Hox, J. Multilevel Analysis: Techniques and Applications. Lawrence Erlbaum Associates, Inc., 2002.
[2] Stram, D. O., and J. W. Lee. "Variance Components Testing in the Longitudinal Mixed-Effects Model." Biometrics, Vol. 50, No. 4, 1994, pp. 1171–1177.