Linear mixed-effects model class
A LinearMixedModel object represents a model of a response variable with fixed and random effects. It comprises data, a model description, fitted coefficients, covariance parameters, design matrices, residuals, residual plots, and other diagnostic information for a linear mixed-effects model. You can predict model responses with the predict function and generate random data at new design points using the random function.
You can fit a linear mixed-effects model using fitlme(tbl,formula) if your data is in a table or dataset array. Alternatively, if your model is not easily described using a formula, you can create matrices to define the fixed and random effects, and fit the model using fitlmematrix(X,y,Z,G).
tbl
Input data, which includes the response variable, predictor variables, and grouping variables, specified as a table or dataset array. The predictor variables can be continuous or grouping variables (see Grouping Variables). You must specify the model for the variables using formula.
Data Types: single | double | char | cell
formula
Formula for model specification, specified as a character vector of the form 'y ~ fixed + (random1|grouping1) + ... + (randomR|groupingR)'. For a full description, see Formula.
Example: 'y ~ treatment + (1|block)'
X
Fixed-effects design matrix, specified as an n-by-p matrix, where n is the number of observations, and p is the number of fixed-effects predictor variables. Each row of X corresponds to one observation, and each column of X corresponds to one variable.
Data Types: single | double
y
Response values, specified as an n-by-1 vector, where n is the number of observations.
Data Types: single | double
Z
Random-effects design, specified as either of the following.
If there is one random-effects term in the model, then Z must be an n-by-q matrix, where n is the number of observations and q is the number of variables in the random-effects term.
If there are R random-effects terms, then Z must be a cell array of length R. Each cell of Z contains an n-by-q(r) design matrix Z{r}, r = 1, 2, ..., R, corresponding to each random-effects term. Here, q(r) is the number of random-effects variables in the rth random-effects design matrix, Z{r}.
Data Types: single | double | cell
G
Grouping variable or variables, specified as either of the following.
If there is one random-effects term, then G must be an n-by-1 vector corresponding to a single grouping variable with M levels or groups. G can be a categorical vector, numeric vector, character array, or cell array of character vectors.
If there are multiple random-effects terms, then G must be a cell array of length R. Each cell of G contains a grouping variable G{r}, r = 1, 2, ..., R, with M(r) levels. G{r} can be a categorical vector, numeric vector, character array, or cell array of character vectors.
Data Types: single | double | char | cell
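The cell-array form of Z and G can be sketched as follows. This is a minimal illustration, assuming the carbig sample data that ships with the toolbox; the choice of two random-effects terms (an intercept and an acceleration slope, both grouped by model year) is an assumption made for the example, not something prescribed above.

```matlab
% Sketch: R = 2 random-effects terms passed to fitlmematrix as cell arrays.
load carbig                                % sample data (assumed available)
n = numel(MPG);                            % number of rows before NaN removal
X = [ones(n,1) Acceleration Horsepower];   % n-by-p fixed-effects design matrix
Z = {ones(n,1), Acceleration};             % Z{1} and Z{2}, each n-by-1, so q(r) = 1
G = {Model_Year, Model_Year};              % one grouping variable per term
lme = fitlmematrix(X,MPG,Z,G);             % rows containing NaN are not used in the fit
```

Because each term sits in its own cell, the two random effects get separate covariance blocks. Putting both columns into a single matrix Z with one grouping variable G would instead define one term with q = 2.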
Coefficients
Fixed-effects coefficient estimates and related statistics, stored as a dataset array containing the following fields.
Name | Name of the term. |
Estimate | Estimated value of the coefficient. |
SE | Standard error of the coefficient. |
tStat | t-statistic for testing the null hypothesis that the coefficient is equal to zero. |
DF | Degrees of freedom for the t-test. The method used to compute DF is specified by the 'DFMethod' name-value pair argument. Coefficients always uses the 'Residual' method for 'DFMethod'. |
pValue | p-value for the t-test. |
Lower | Lower limit of the confidence interval for the coefficient. Coefficients always uses the 95% confidence level, that is, 'alpha' is 0.05. |
Upper | Upper limit of the confidence interval for the coefficient. Coefficients always uses the 95% confidence level, that is, 'alpha' is 0.05. |
You can change 'DFMethod' and 'alpha' while computing confidence intervals for, or testing hypotheses involving, fixed and random effects, using the coefCI and coefTest methods.
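As a hedged illustration of changing these settings, the sketch below requests 99% intervals and Satterthwaite degrees of freedom. The model, the carbig data, and the contrast vector are assumptions chosen for the example.

```matlab
% Sketch: non-default confidence level and degrees-of-freedom method.
load carbig
tbl = table(MPG,Acceleration,Horsepower,Model_Year);
lme = fitlme(tbl,'MPG ~ Acceleration + Horsepower + (1|Model_Year)');

% 99% confidence intervals (alpha = 0.01) with Satterthwaite degrees of freedom
ci99 = coefCI(lme,'Alpha',0.01,'DFMethod','satterthwaite');

% F-test of H0: the Acceleration coefficient equals 0, using the same DF method.
% The contrast [0 1 0] selects the second fixed-effects coefficient.
[pVal,F,df1,df2] = coefTest(lme,[0 1 0],0,'DFMethod','satterthwaite');
```

The Coefficients property itself is not affected by these calls; it keeps the 'Residual' method and the 95% level.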
CoefficientCovariance
Covariance of the estimated fixed-effects coefficients of the linear mixed-effects model, stored as a p-by-p matrix, where p is the number of fixed-effects coefficients.
You can display the covariance parameters associated with the random effects using the covarianceParameters method.
Data Types: double
CoefficientNames
Names of the fixed-effects coefficients of a linear mixed-effects model, stored as a 1-by-p cell array of character vectors.
Data Types: cell
DFE
Residual degrees of freedom, stored as a positive integer value. DFE = n – p, where n is the number of observations, and p is the number of fixed-effects coefficients.
This corresponds to the 'Residual' method of calculating degrees of freedom in the fixedEffects and randomEffects methods.
Data Types: double
FitMethod
Method used to fit the linear mixed-effects model, stored as either of the following.
ML, if the fitting method is maximum likelihood
REML, if the fitting method is restricted maximum likelihood
Data Types: char
Formula
Specification of the fixed-effects terms, random-effects terms, and grouping variables that define the linear mixed-effects model, stored as an object.
For more information on how to specify the model to fit using a formula, see Formula.
LogLikelihood
Maximized log likelihood or maximized restricted log likelihood of the fitted linear mixed-effects model, depending on the fitting method you choose, stored as a scalar value.
Data Types: double
ModelCriterion
Model criterion to compare fitted linear mixed-effects models, stored as a dataset array with the following columns.
AIC | Akaike Information Criterion |
BIC | Bayesian Information Criterion |
Loglikelihood | Log likelihood value of the model |
Deviance | –2 times the log likelihood of the model |
If n is the number of observations used in fitting the model, and p is the number of fixed-effects coefficients, then for calculating AIC and BIC:
The total number of parameters is nc + p + 1, where nc is the total number of parameters in the random-effects covariance, excluding the residual variance.
The effective number of observations is
n, when the fitting method is maximum likelihood (ML)
n – p, when the fitting method is restricted maximum likelihood (REML)
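As a worked check, the two criteria can be written with the parameter count and effective number of observations described above. The formulas are the standard AIC and BIC definitions; the symbols $\log L$ and $n_{\mathrm{eff}}$ are notation introduced here for illustration. The numbers come from the influenza example on this page, which is fit by ML with deviance 296.71, n = 468, p = 9, and a single random-intercept standard deviation (so nc = 1).

```latex
\mathrm{AIC} = -2\log L + 2\,(n_c + p + 1), \qquad
\mathrm{BIC} = -2\log L + \log(n_{\mathrm{eff}})\,(n_c + p + 1)

% ML fit: n_eff = n = 468 and n_c + p + 1 = 1 + 9 + 1 = 11
\mathrm{AIC} = 296.71 + 2 \cdot 11 = 318.71, \qquad
\mathrm{BIC} \approx 296.71 + 11 \cdot \log(468) \approx 296.71 + 67.63 = 364.34
```

These values agree with the AIC (318.71) and BIC (364.35) shown in that example's output, up to rounding of the displayed deviance.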
MSE
ML or REML estimate of σ^{2}, depending on the fitting method used, stored as a positive scalar value. σ^{2} is the residual variance or variance of the observation error term of the linear mixed-effects model.
Data Types: double
NumCoefficients
Number of fixed-effects coefficients in the fitted linear mixed-effects model, stored as a positive integer value.
Data Types: double
NumEstimatedCoefficients
Number of estimated fixed-effects coefficients in the fitted linear mixed-effects model, stored as a positive integer value.
Data Types: double
NumObservations
Number of observations used in the fit, stored as a positive integer value. This is the number of rows in the table or dataset array, or the design matrices, minus the excluded rows or rows with NaN values.
Data Types: double
NumPredictors
Number of variables used as predictors in the linear mixed-effects model, stored as a positive integer value.
Data Types: double
NumVariables
Total number of variables including the response and predictors, stored as a positive integer value.
If the sample data is in a table or dataset array tbl, NumVariables is the total number of variables in tbl, including the response variable.
If the fit is based on matrix input, NumVariables is the total number of columns in the predictor matrix or matrices, and response vector.
NumVariables includes variables, if there are any, that are not used as predictors or as the response.
Data Types: double
ObservationInfo
Information about the observations used in the fit, stored as a table.
ObservationInfo has one row for each observation and the following four columns.
Weights | The value of the weighted variable for that observation. Default value is 1. |
Excluded | true, if the observation was excluded from the fit using the 'Exclude' name-value pair argument, false, otherwise. 1 stands for true and 0 stands for false. |
Missing | true, if the observation was excluded from the fit because any response or predictor value is missing, false, otherwise. Missing values include NaN for numeric variables, empty cells for cell arrays, blank rows for character arrays, and the undefined value for nominal arrays. |
Subset | true, if the observation was used in the fit, false, if it was not used because it is missing or excluded. |
Data Types: table
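The columns of ObservationInfo can be used to account for every input row. The sketch below is illustrative only; the carbig data and the model formula are assumptions chosen because some rows of that data set contain NaN values.

```matlab
% Sketch: count the rows that were used in the fit and the rows that were dropped.
load carbig
tbl = table(MPG,Acceleration,Horsepower,Model_Year);
lme = fitlme(tbl,'MPG ~ Acceleration + Horsepower + (1|Model_Year)');

info     = lme.ObservationInfo;   % one row per row of tbl
nUsed    = sum(info.Subset);      % rows that entered the fit
nMissing = sum(info.Missing);     % rows dropped for missing values
nExcl    = sum(info.Excluded);    % rows dropped through 'Exclude' (none here)
```

With no 'Exclude' argument, nUsed plus nMissing accounts for all rows of tbl, and nUsed matches the NumObservations property.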
ObservationNames
Names of observations used in the fit, stored as a cell array of character vectors.
If the data is in a table or dataset array, tbl, containing observation names, ObservationNames has those names.
If the data is provided in matrices, or a table or dataset array without observation names, then ObservationNames is an empty cell array.
Data Types: cell
PredictorNames
Names of the variables that you use as predictors in the fit, stored as a cell array of character vectors that has the same length as NumPredictors.
Data Types: cell
ResponseName
Name of the variable used as the response variable in the fit, stored as a character vector.
Data Types: char
Rsquared
Proportion of variability in the response explained by the fitted model, stored as a structure. It is the coefficient of determination, or R-squared. Rsquared has two fields.
Ordinary | R-squared value, stored as a scalar value in a structure. Rsquared.Ordinary = 1 – SSE./SST |
Adjusted | R-squared value adjusted for the number of fixed-effects coefficients, stored as a scalar value in a structure. Rsquared.Adjusted = 1 – (SSE./SST)*(DFT./DFE), where DFE = n – p, DFT = n – 1, n is the total number of observations, and p is the number of fixed-effects coefficients. |
Data Types: struct
SSE
Error sum of squares, that is, the sum of the squared conditional residuals, stored as a positive scalar value.
SSE = sum((y - F).^2), where y is the response vector, and F is the fitted conditional response of the linear mixed-effects model. The conditional model has contributions from both fixed and random effects.
Data Types: double
SSR
Regression sum of squares, that is, the sum of squares explained by the linear mixed-effects regression, stored as a positive scalar value. It is the sum of squared deviations of the conditional fitted values from their mean.
SSR = sum((F - mean(F)).^2), where F is the fitted conditional response of the linear mixed-effects model. The conditional model has contributions from both fixed and random effects.
Data Types: double
SST
Total sum of squares, that is, the sum of the squared deviations of the observed response values from their mean, stored as a positive scalar value.
SST = sum((y - mean(y)).^2) = SSR + SSE, where y is the response vector.
Data Types: double
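The sum-of-squares definitions can be evaluated directly from the response and the conditional fitted values. This is a minimal sketch under stated assumptions: the carbig data and the model formula are chosen for illustration, and incomplete rows are removed up front so that every row of the table is used in the fit.

```matlab
% Sketch: evaluate the SSE, SSR, and SST expressions by hand.
load carbig
tbl = table(MPG,Acceleration,Horsepower,Model_Year);
tbl = tbl(~any(ismissing(tbl),2),:);   % keep complete rows only
lme = fitlme(tbl,'MPG ~ Acceleration + Horsepower + (1|Model_Year)');

y = response(lme);                     % observed response vector
F = fitted(lme);                       % conditional fitted values (the default)

sse = sum((y - F).^2);                 % error sum of squares
ssr = sum((F - mean(F)).^2);           % regression sum of squares
sst = sum((y - mean(y)).^2);           % total sum of squares

r2 = 1 - lme.SSE/lme.SST;              % compare with lme.Rsquared.Ordinary
```

Comparing sse with lme.SSE, and r2 with lme.Rsquared.Ordinary, is a quick way to confirm that the stored properties refer to the conditional fit rather than the marginal one.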
Variables
Variables, stored as a table.
If the fit is based on a table or dataset array tbl, then Variables is identical to tbl.
If the fit is based on matrix input, then Variables is a table containing all the variables in the predictor matrix or matrices, and response variable.
Data Types: table
VariableInfo
Information about the variables used in the fit, stored as a table.
VariableInfo has one row for each variable and contains the following four columns.
Class | Class of the variable ('double', 'cell', 'nominal', and so on). |
Range | Value range of the variable. |
InModel | true, if the variable is a predictor in the fitted model. |
IsCategorical | true, if the variable has a type that is treated as a categorical predictor, such as cell, logical, or categorical, or if it is specified as categorical by the 'Categorical' name-value pair argument of the fit method. |
Data Types: table
VariableNames
Names of the variables used in the fit, stored as a cell array of character vectors.
If sample data is in a table or dataset array tbl, VariableNames contains the names of the variables in tbl.
If sample data is in matrix format, then VariableNames includes variable names you supply while fitting the model. If you do not supply the variable names, then VariableNames contains the default names.
Data Types: cell
anova | Analysis of variance for linear mixed-effects model |
coefCI | Confidence intervals for coefficients of linear mixed-effects model |
coefTest | Hypothesis test on fixed and random effects of linear mixed-effects model |
compare | Compare linear mixed-effects models |
covarianceParameters | Extract covariance parameters of linear mixed-effects model |
designMatrix | Fixed- and random-effects design matrices |
disp | Display linear mixed-effects model |
fit | Fit linear mixed-effects model using tables |
fitmatrix | Fit linear mixed-effects model using design matrices |
fitted | Fitted responses from a linear mixed-effects model |
fixedEffects | Estimates of fixed effects and related statistics |
plotResiduals | Plot residuals of linear mixed-effects model |
predict | Predict response of linear mixed-effects model |
random | Generate random responses from fitted linear mixed-effects model |
randomEffects | Estimates of random effects and related statistics |
residuals | Residuals of fitted linear mixed-effects model |
response | Response vector of the linear mixed-effects model |
Value. To learn how value classes affect copy operations, see Copying Objects (MATLAB) in the MATLAB® documentation.
Load the sample data.
load flu
The flu dataset array has a Date variable, and 10 variables containing estimated influenza rates (in 9 different regions, estimated from Google® searches, plus a nationwide estimate from the Centers for Disease Control and Prevention, CDC).
To fit a linear mixed-effects model, your data must be in a properly formatted dataset array. To fit a linear mixed-effects model with the influenza rates as the responses and region as the predictor variable, combine the nine columns corresponding to the regions into an array. The new dataset array, flu2, must have the response variable, FluRate, the nominal variable, Region, that shows which region each estimate is from, and the grouping variable Date.
flu2 = stack(flu,2:10,'NewDataVarName','FluRate',...
    'IndVarName','Region');
flu2.Date = nominal(flu2.Date);
Fit a linear mixed-effects model with fixed effects for region and a random intercept that varies by Date.
Because region is a nominal variable, fitlme takes the first region, NE, as the reference and creates eight dummy variables representing the other eight regions. For example, $I[\mathrm{MidAtl}]$ is the dummy variable representing the region MidAtl. For details, see Dummy Indicator Variables.
The corresponding model is
$$y_{im} = \beta_0 + \beta_1 I[\mathrm{MidAtl}]_i + \beta_2 I[\mathrm{ENCentral}]_i + \dots + \beta_8 I[\mathrm{Pac}]_i + b_{0m} + \varepsilon_{im}, \quad m = 1, 2, \dots, 52,$$
where $y_{im}$ is the observation $i$ for level $m$ of grouping variable Date, $\beta_j$, $j$ = 0, 1, ..., 8, are the fixed-effects coefficients, $b_{0m}$ is the random effect for level $m$ of the grouping variable Date, and $\varepsilon_{im}$ is the observation error for observation $i$. The random effect has the prior distribution $b_{0m} \sim N(0, \sigma_b^2)$, and the error term has the distribution $\varepsilon_{im} \sim N(0, \sigma^2)$.
lme = fitlme(flu2,'FluRate ~ 1 + Region + (1|Date)')
lme =
Linear mixed-effects model fit by ML

Model information:
    Number of observations             468
    Fixed effects coefficients           9
    Random effects coefficients         52
    Covariance parameters                2

Formula:
    FluRate ~ 1 + Region + (1 | Date)

Model fit statistics:
    AIC       BIC       LogLikelihood    Deviance
    318.71    364.35    -148.36          296.71

Fixed effects coefficients (95% CIs):
    Name                  Estimate    SE          tStat      DF     pValue        Lower        Upper
    '(Intercept)'           1.2233    0.096678     12.654    459     1.085e-31       1.0334       1.4133
    'Region_MidAtl'       0.010192    0.052221    0.19518    459       0.84534    -0.092429      0.11281
    'Region_ENCentral'    0.051923    0.052221     0.9943    459        0.3206    -0.050698      0.15454
    'Region_WNCentral'     0.23687    0.052221     4.5359    459    7.3324e-06      0.13424      0.33949
    'Region_SAtl'         0.075481    0.052221     1.4454    459       0.14902     -0.02714       0.1781
    'Region_ESCentral'     0.33917    0.052221      6.495    459    2.1623e-10      0.23655      0.44179
    'Region_WSCentral'       0.069    0.052221     1.3213    459       0.18705    -0.033621      0.17162
    'Region_Mtn'          0.046673    0.052221    0.89377    459       0.37191    -0.055948      0.14929
    'Region_Pac'          -0.16013    0.052221    -3.0665    459     0.0022936     -0.26276    -0.057514

Random effects covariance parameters (95% CIs):
Group: Date (52 Levels)
    Name1            Name2            Type     Estimate    Lower     Upper
    '(Intercept)'    '(Intercept)'    'std'    0.6443      0.5297    0.78368

Group: Error
    Name         Estimate    Lower      Upper
    'Res Std'    0.26627     0.24878    0.285
The p-values 7.3324e-06 and 2.1623e-10 respectively show that the fixed effects of the flu rates in regions WNCentral and ESCentral are significantly different relative to the flu rates in region NE.
The confidence limits for the standard deviation of the random-effects term, $\sigma_b$, do not include 0 (0.5297, 0.78368), which indicates that the random-effects term is significant. You can also test the significance of the random-effects terms using the compare method.
The estimated value of an observation is the sum of the fixed effects and the random-effect value at the grouping variable level corresponding to that observation. For example, the estimated best linear unbiased predictor (BLUP) of the flu rate for region WNCentral in week 10/9/2005 is
$$\hat{y}_{\mathrm{WNCentral},\,10/9/2005} = \hat{\beta}_0 + \hat{\beta}_3 I[\mathrm{WNCentral}] + \hat{b}_{10/9/2005} = 1.2233 + 0.23687 - 0.1718 = 1.2884.$$
This is the fitted conditional response, since it includes contributions to the estimate from both the fixed and random effects. You can compute this value as follows.
beta = fixedEffects(lme);
[~,~,STATS] = randomEffects(lme); % Compute the random-effects statistics (STATS)
STATS.Level = nominal(STATS.Level);
y_hat = beta(1) + beta(4) + STATS.Estimate(STATS.Level=='10/9/2005')
y_hat = 1.2884
You can also display the fitted value using the fitted method.
F = fitted(lme);
F(flu2.Date == '10/9/2005' & flu2.Region == 'WNCentral')
ans = 1.2884
Compute the fitted marginal response for region WNCentral in week 10/9/2005.
F = fitted(lme,'Conditional',false);
F(flu2.Date == '10/9/2005' & flu2.Region == 'WNCentral')
ans = 1.4602
Load the sample data.
load carbig
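Fit a linear mixed-effects model for miles per gallon (MPG), with fixed effects for acceleration and horsepower, and potentially correlated random effects for intercept and acceleration grouped by model year. This model corresponds to
$$\mathrm{MPG}_{im} = \beta_0 + \beta_1 \mathrm{Acc}_{im} + \beta_2 \mathrm{HP}_{im} + b_{0m} + b_{1m} \mathrm{Acc}_{im} + \varepsilon_{im}, \quad m = 1, 2, \dots, 13,$$
with the random-effects terms having the following prior distributions:
$$b_m = \begin{pmatrix} b_{0m} \\ b_{1m} \end{pmatrix} \sim N\left(0, \begin{pmatrix} \sigma_0^2 & \sigma_{0,1} \\ \sigma_{0,1} & \sigma_1^2 \end{pmatrix}\right),$$
where $m$ represents the model year.
First, prepare the design matrices for fitting the linear mixed-effects model.
X = [ones(406,1) Acceleration Horsepower];
Z = [ones(406,1) Acceleration];
Model_Year = nominal(Model_Year);
G = Model_Year;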
Now, fit the model using fitlmematrix with the defined design matrices and grouping variables. Use restricted maximum likelihood ('REML') as the fit method.
lme = fitlmematrix(X,MPG,Z,G,'FixedEffectPredictors',...
    {'Intercept','Acceleration','Horsepower'},'RandomEffectPredictors',...
    {{'Intercept','Acceleration'}},'RandomEffectGroups',{'Model_Year'},...
    'FitMethod','REML')
lme =
Linear mixed-effects model fit by REML

Model information:
    Number of observations             392
    Fixed effects coefficients           3
    Random effects coefficients         26
    Covariance parameters                4

Formula:
    Linear Mixed Formula with 4 predictors.

Model fit statistics:
    AIC       BIC       LogLikelihood    Deviance
    2202.9    2230.7    -1094.5          2188.9

Fixed effects coefficients (95% CIs):
    Name              Estimate    SE           tStat      DF     pValue        Lower       Upper
    'Intercept'         50.064       2.3176     21.602    389    1.4185e-68      45.507       54.62
    'Acceleration'    -0.57897      0.13843    -4.1825    389    3.5654e-05    -0.85112    -0.30681
    'Horsepower'      -0.16958    0.0073242    -23.153    389    3.5289e-75    -0.18398    -0.15518

Random effects covariance parameters (95% CIs):
Group: Model_Year (13 Levels)
    Name1             Name2             Type      Estimate    Lower       Upper
    'Intercept'       'Intercept'       'std'        3.72       1.5215      9.0954
    'Acceleration'    'Intercept'       'corr'    -0.8769     -0.98274    -0.33846
    'Acceleration'    'Acceleration'    'std'      0.3593      0.19418     0.66483

Group: Error
    Name         Estimate    Lower     Upper
    'Res Std'    3.6913      3.4331    3.9688
The fixed-effects coefficients display includes the estimates, standard errors (SE), and the 95% confidence interval limits (Lower and Upper). The p-values (pValue) indicate that all three fixed-effects coefficients are significant.
The confidence intervals for the standard deviations and the correlation between the random effects for intercept and acceleration do not include zero, hence they seem significant. Use the compare method to test for the random effects.
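A test of this kind can be sketched as below. This is an illustrative setup rather than a continuation of the matrix-based fit: it assumes the same carbig variables placed in a table, uses formula-based fitting with the default ML method, and the names lmeSmall and lmeFull are hypothetical.

```matlab
% Sketch: likelihood ratio test for the random slope on Acceleration.
load carbig
tbl = table(MPG,Acceleration,Horsepower,Model_Year);

% Smaller model: random intercept only
lmeSmall = fitlme(tbl,'MPG ~ Acceleration + Horsepower + (1|Model_Year)');

% Larger model: random intercept and slope, possibly correlated
lmeFull = fitlme(tbl,'MPG ~ Acceleration + Horsepower + (Acceleration|Model_Year)');

% The smaller (nested) model is passed first
results = compare(lmeSmall,lmeFull)
```

A small p-value in the result suggests that the random slope improves the fit. Both models share the same fixed-effects terms, so the comparison isolates the random-effects structure.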
Display the covariance matrix of the estimated fixed-effects coefficients.
lme.CoefficientCovariance
ans =
    5.3711   -0.2809   -0.0126
   -0.2809    0.0192    0.0005
   -0.0126    0.0005    0.0001
The diagonal elements show the variances of the fixed-effects coefficient estimates. For example, the variance of the estimate of the intercept is 5.3711. Note that the standard errors of the estimates are the square roots of the variances. For example, the standard error of the intercept is 2.3176, which is sqrt(5.3711).
The off-diagonal elements show the covariances between the fixed-effects coefficient estimates. For example, the covariance between the intercept and acceleration is –0.2809 and the covariance between acceleration and horsepower is 0.0005.
Display the coefficient of determination for the model.
lme.Rsquared
ans = struct with fields:
    Ordinary: 0.7826
    Adjusted: 0.7815
The adjusted value is the R-squared value adjusted for the number of predictors in the model.
In general, a formula for model specification is a character vector of the form 'y ~ terms'. For the linear mixed-effects models, this formula is in the form 'y ~ fixed + (random1|grouping1) + ... + (randomR|groupingR)', where fixed and random contain the fixed-effects and the random-effects terms.
Suppose a table tbl contains the following:
A response variable, y
Predictor variables, X_{j}, which can be continuous or grouping variables
Grouping variables, g_{1}, g_{2}, ..., g_{R},
where the grouping variables in X_{j} and g_{r} can be categorical, logical, character arrays, or cell arrays of character vectors.
Then, in a formula of the form 'y ~ fixed + (random_{1}|g_{1}) + ... + (random_{R}|g_{R})', the term fixed corresponds to a specification of the fixed-effects design matrix X, random_{1} is a specification of the random-effects design matrix Z_{1} corresponding to grouping variable g_{1}, and similarly random_{R} is a specification of the random-effects design matrix Z_{R} corresponding to grouping variable g_{R}.
You can express the fixed and random terms using Wilkinson notation.
Wilkinson notation describes the factors present in models. The notation relates to factors present in models, not to the multipliers (coefficients) of those factors.
Wilkinson Notation | Factors in Standard Notation |
---|---|
1 | Constant (intercept) term |
X^k, where k is a positive integer | X, X^{2}, ..., X^{k} |
X1 + X2 | X1, X2 |
X1*X2 | X1, X2, X1.*X2 (element-wise multiplication of X1 and X2) |
X1:X2 | X1.*X2 only |
- X2 | Do not include X2 |
X1*X2 + X3 | X1, X2, X3, X1*X2 |
X1 + X2 + X3 + X1:X2 | X1, X2, X3, X1*X2 |
X1*X2*X3 - X1:X2:X3 | X1, X2, X3, X1*X2, X1*X3, X2*X3 |
X1*(X2 + X3) | X1, X2, X3, X1*X2, X1*X3 |
Statistics and Machine Learning Toolbox™ notation always includes a constant term unless you explicitly remove the term using -1.
Here are some examples for linear mixed-effects model specification.
Examples:
Formula | Description |
---|---|
'y ~ X1 + X2' | Fixed effects for the intercept, X1 and X2. This is equivalent to 'y ~ 1 + X1 + X2'. |
'y ~ -1 + X1 + X2' | No intercept and fixed effects for X1 and X2. The implicit intercept term is suppressed by including -1. |
'y ~ 1 + (1 | g1)' | Fixed effects for the intercept plus random effect for the intercept for each level of the grouping variable g1. |
'y ~ X1 + (1 | g1)' | Random intercept model with a fixed slope. |
'y ~ X1 + (X1 | g1)' | Random intercept and slope, with possible correlation between them. This is equivalent to 'y ~ 1 + X1 + (1 + X1|g1)'. |
'y ~ X1 + (1 | g1) + (-1 + X1 | g1)' | Independent random effects terms for intercept and slope. |
'y ~ 1 + (1 | g1) + (1 | g2) + (1 | g1:g2)' | Random intercept model with independent main effects for g1 and g2, plus an independent interaction effect. |
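The difference between a correlated and an independent random-effects specification can be seen in the estimated covariance structure. The sketch below is illustrative: the carbig data, the choice of predictors, and the names lmeCorr and lmeIndep are assumptions made for the example.

```matlab
% Sketch: correlated versus independent random intercept and slope.
load carbig
tbl = table(MPG,Acceleration,Horsepower,Model_Year);

% One random-effects term: intercept and slope share a 2-by-2 covariance matrix
lmeCorr = fitlme(tbl,'MPG ~ Acceleration + Horsepower + (Acceleration|Model_Year)');

% Two random-effects terms: intercept and slope are modeled as independent
lmeIndep = fitlme(tbl, ...
    'MPG ~ Acceleration + Horsepower + (1|Model_Year) + (-1 + Acceleration|Model_Year)');

psiCorr  = covarianceParameters(lmeCorr);    % one cell holding a 2-by-2 matrix
psiIndep = covarianceParameters(lmeIndep);   % two cells, one variance each
```

The -1 inside the second term of lmeIndep suppresses the implicit intercept, so that term contains only the slope.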