random

Generate random responses from fitted generalized linear mixed-effects model

Syntax

ysim = random(glme)

ysim = random(glme,tblnew)

ysim = random(___,Name,Value)

Description

ysim = random(glme) returns simulated responses, ysim, from the fitted generalized linear mixed-effects model glme, at the original design points.

ysim = random(glme,tblnew) returns simulated responses using new input values specified in the table or dataset array, tblnew.

ysim = random(___,Name,Value) returns simulated responses using additional options specified by one or more Name,Value pair arguments, using any of the previous syntaxes. For example, you can specify observation weights, binomial sizes, or offsets for the model.

Input Arguments

`glme` — Generalized linear mixed-effects model
`GeneralizedLinearMixedModel` object

Generalized linear mixed-effects model, specified as a GeneralizedLinearMixedModel object. For properties and methods of this object, see GeneralizedLinearMixedModel.

`tblnew` — New input data
table | dataset array

New input data, which includes the response variable, predictor variables, and grouping variables, specified as a table or dataset array. The predictor variables can be continuous or grouping variables. tblnew must contain the same variables as the original table or dataset array, tbl, used to fit the generalized linear mixed-effects model glme.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

`BinomialSize` — Number of trials for binomial distribution
`ones(m,1)` (default) | m-by-1 vector of positive integer values

Number of trials for binomial distribution, specified as the comma-separated pair consisting of 'BinomialSize' and an m-by-1 vector of positive integer values, where m is the number of rows in tblnew. The 'BinomialSize' name-value pair applies only to the binomial distribution. The value specifies the number of binomial trials when generating the random response values.

Data Types: single | double

`Offset` — Model offset
`zeros(m,1)` (default) | vector of scalar values

Model offset, specified as a vector of scalar values of length m, where m is the number of rows in tblnew. The offset is used as an additional predictor and has a coefficient value fixed at 1.
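As an illustration of how an offset enters the simulation, the following minimal sketch refits the Poisson model from the example on this page and passes a constant offset of log(2) for every row of tblnew. The offset value and the variable names off and ysimOff are assumptions chosen for illustration. Because the offset has a coefficient fixed at 1 and the model uses a log link, this offset doubles the conditional mean of each simulated response.

```matlab
% Fit the Poisson GLME used in the example on this page.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log','FitMethod','Laplace', ...
    'DummyVarCoding','effects');

% New input data: the first 10 rows of the original table.
tblnew = mfr(1:10,:);
m = height(tblnew);

% Hypothetical offset: log(2) on the linear-predictor scale for every row.
off = log(2)*ones(m,1);

rng(0,'twister');  % For reproducibility
ysimOff = random(glme,tblnew,'Offset',off);  % In R2021a or later: Offset=off
```

The offset vector must have one element per row of tblnew, so it is built from height(tblnew) rather than from a hard-coded length.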

`Weights` — Observation weights
m-by-1 vector of nonnegative scalar values

Observation weights, specified as the comma-separated pair consisting of 'Weights' and an m-by-1 vector of nonnegative scalar values, where m is the number of rows in tblnew. If the response distribution is binomial or Poisson, then 'Weights' must be a vector of positive integers.

Data Types: single | double

Output Arguments

`ysim` — Simulated response values
m-by-1 vector

Simulated response values, returned as an m-by-1 vector, where m is the number of rows in tblnew, or the number of observations at the original design points if you do not specify tblnew. random creates ysim in two steps: it first generates the random-effects vector from its fitted prior distribution, and then generates ysim from the fitted conditional distribution given those random effects. random also accounts for any observation weights specified when fitting the model using fitglme.

Examples

Simulate Random Responses from a GLME Model

Load the sample data.

load mfr

This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:

Flag to indicate whether the batch used the new process (newprocess)
Processing time for each batch, in hours (time)
Temperature of the batch, in degrees Celsius (temp)
Categorical variable indicating the supplier (A, B, or C) of the chemical used in the batch (supplier)
Number of defects in the batch (defects)

The data also includes time_dev and temp_dev, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.

Fit a generalized linear mixed-effects model using newprocess, time_dev, temp_dev, and supplier as fixed-effects predictors. Include a random-effects term for intercept grouped by factory, to account for quality differences that might exist due to factory-specific variations. The response variable defects has a Poisson distribution, and the appropriate link function for this model is log. Use the Laplace fit method to estimate the coefficients. Specify the dummy variable encoding as 'effects', so the dummy variable coefficients sum to 0.

The number of defects can be modeled using a Poisson distribution:

$\mathrm{defects}_{ij} \sim \mathrm{Poisson}(\mu_{ij})$

This corresponds to the generalized linear mixed-effects model

$\log(\mu_{ij}) = \beta_{0} + \beta_{1} \mathrm{newprocess}_{ij} + \beta_{2} \mathrm{time\_dev}_{ij} + \beta_{3} \mathrm{temp\_dev}_{ij} + \beta_{4} \mathrm{supplier\_C}_{ij} + \beta_{5} \mathrm{supplier\_B}_{ij} + b_{i},$

where

$\mathrm{defects}_{ij}$ is the number of defects observed in the batch produced by factory $i$ during batch $j$.
$\mu_{ij}$ is the mean number of defects corresponding to factory $i$ (where $i = 1, 2, \ldots, 20$) during batch $j$ (where $j = 1, 2, \ldots, 5$).
$\mathrm{newprocess}_{ij}$, $\mathrm{time\_dev}_{ij}$, and $\mathrm{temp\_dev}_{ij}$ are the measurements for each variable that correspond to factory $i$ during batch $j$. For example, $\mathrm{newprocess}_{ij}$ indicates whether the batch produced by factory $i$ during batch $j$ used the new process.
$\mathrm{supplier\_C}_{ij}$ and $\mathrm{supplier\_B}_{ij}$ are dummy variables that use effects (sum-to-zero) coding to indicate whether company C or B, respectively, supplied the process chemicals for the batch produced by factory $i$ during batch $j$.
$b_{i} \sim N(0, \sigma_{b}^{2})$ is a random-effects intercept for each factory $i$ that accounts for factory-specific variation in quality.

glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log','FitMethod','Laplace', ...
    'DummyVarCoding','effects');

Use random to simulate a new response vector from the fitted model.

rng(0,'twister');  % For reproducibility
ynew = random(glme);

Display the first 10 rows of the simulated response vector.

ynew(1:10)

Simulate a new response vector using new input values. Create a new table by copying the first 10 rows of mfr into tblnew.

tblnew = mfr(1:10,:);

The first 10 rows of mfr include data collected from trials 1 through 5 for factories 1 and 2. Both factories used the old process for all of their trials during the experiment, so newprocess = 0 for all 10 observations.

Change the value of newprocess to 1 for the observations in tblnew.

tblnew.newprocess = ones(height(tblnew),1);

Simulate new responses using the new input values in tblnew.

ynew2 = random(glme,tblnew)

ynew2 = 10×1

     2
     3
     5
     4
     2
     2
     2
     1
     2
     0
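Each call to random draws a new random-effects vector and new responses, so repeating the call gives a simulated distribution instead of a single draw. The following sketch, which is an illustrative extension and not part of the original example, compares the average simulated defect count for the same 10 rows under the old and the new process. The names tblOldProc, tblNewProc, nsim, meanOld, and meanNew, and the choice of 500 repetitions, are assumptions.

```matlab
% Fit the same Poisson GLME as in the example above.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log','FitMethod','Laplace', ...
    'DummyVarCoding','effects');

% Two versions of the first 10 rows: as observed (old process), and
% with newprocess switched to 1.
tblOldProc = mfr(1:10,:);
tblNewProc = tblOldProc;
tblNewProc.newprocess = ones(height(tblNewProc),1);

rng(0,'twister');  % For reproducibility
nsim = 500;        % Number of simulated response vectors (arbitrary choice)
meanOld = zeros(nsim,1);
meanNew = zeros(nsim,1);
for k = 1:nsim
    % Each call simulates fresh random effects and fresh responses.
    meanOld(k) = mean(random(glme,tblOldProc));
    meanNew(k) = mean(random(glme,tblNewProc));
end

% Average simulated defects per batch under each process.
simSummary = [mean(meanOld) mean(meanNew)]
```

A histogram of meanOld and meanNew would show how much of the spread comes from simulation variability, which a single call to random cannot reveal.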

More About

Conditional Distribution Method

random generates random data from the fitted generalized linear mixed-effects model as follows:

Sample $b_{\mathrm{sim}} \sim P(b \mid \hat{\theta}, \hat{\sigma}^{2})$, where $P(b \mid \hat{\theta}, \hat{\sigma}^{2})$ is the estimated prior distribution of the random effects, $\hat{\theta}$ is the vector of estimated covariance parameters, and $\hat{\sigma}^{2}$ is the estimated dispersion parameter.
Given $b_{\mathrm{sim}}$, for $i = 1, \ldots, m$, sample $y_{\mathrm{sim},i} \sim P(y_{\mathrm{new},i} \mid b_{\mathrm{sim}}, \hat{\beta}, \hat{\theta}, \hat{\sigma}^{2})$, where $P(y_{\mathrm{new},i} \mid b_{\mathrm{sim}}, \hat{\beta}, \hat{\theta}, \hat{\sigma}^{2})$ is the conditional distribution of the $i$th new response $y_{\mathrm{new},i}$ given $b_{\mathrm{sim}}$ and the model parameters.
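The two steps above can be sketched by hand for the Poisson model with a random intercept per factory that is used in the example on this page. This is a minimal illustration under stated assumptions, not the internal implementation of random: it ignores offsets and observation weights, relies on the dispersion being fixed for the Poisson distribution, and uses the hypothetical names bsim, muSim, and ysimManual. Its draws are not expected to match the output of random(glme) value for value.

```matlab
% Fit the Poisson GLME from the example on this page.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log','FitMethod','Laplace', ...
    'DummyVarCoding','effects');

% Extract the fitted pieces needed for a manual simulation.
X = designMatrix(glme,'Fixed');       % Fixed-effects design matrix
Z = designMatrix(glme,'Random');      % Random-effects design matrix
betaHat = fixedEffects(glme);         % Estimated fixed-effects coefficients
psiHat = covarianceParameters(glme);  % Cell array of random-effects covariances
sigmaB = sqrt(psiHat{1});             % Standard deviation of the random intercept

rng(0,'twister');  % For reproducibility

% Step 1: sample the random effects from the estimated prior N(0, sigmaB^2).
bsim = sigmaB*randn(size(Z,2),1);

% Step 2: given bsim, sample each response from its conditional Poisson
% distribution, whose mean follows from the log link.
muSim = exp(X*betaHat + full(Z*bsim));
ysimManual = poissrnd(muSim);
```

Drawing bsim once and reusing it for every observation is what induces correlation between simulated batches from the same factory.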
