# Documentation

### This is machine translation

Translated by
Mouseover text to see original. Click the button below to return to the English version of the page.

# simulate

Simulate regression coefficients and disturbance variance of Bayesian linear regression model

## Syntax

``````[BetaSim,sigma2Sim] = simulate(Mdl)``````
``````[BetaSim,sigma2Sim] = simulate(Mdl,X,y)``````
``````[BetaSim,sigma2Sim] = simulate(___,Name,Value)``````

## Description

example

``````[BetaSim,sigma2Sim] = simulate(Mdl)``` returns a random vector of regression coefficients (`BetaSim`) and a random disturbance variance (`sigma2Sim`) drawn from the Bayesian linear regression model `Mdl` of β and σ2. If `Mdl` is a joint prior model (returned by `bayeslm`), then `simulate` draws from the joint prior distribution.If `Mdl` is a joint posterior model (returned by `estimate`), then `simulate` draws from the joint posterior distributions. ```

example

``````[BetaSim,sigma2Sim] = simulate(Mdl,X,y)``` draws from the marginal posterior distributions produced or updated by incorporating the predictor data `X` and corresponding response data `y`. If `Mdl` is a joint prior model, then `simulate` produces the marginal posterior distributions by “updating” the prior model with information about the parameters that it gleans from the data.If `Mdl` is a marginal posterior model, then `simulate` updates the posteriors with information about the parameters that it gleans from the additional data. That is, the complete data likelihood is composed of the additional data `X` and `y`, and the data that created `Mdl`. `NaN`s in the data indicate missing values, which `simulate` removes using list-wise deletion.```

example

``````[BetaSim,sigma2Sim] = simulate(___,Name,Value)``` uses any of the input arguments in the previous syntaxes and additional options specified by one or more `Name,Value` pair arguments. For example, you can specify a value for one of β or σ2 to simulate from the conditional posterior distribution of one parameter given the specified value of the other parameter.```

## Examples

collapse all

Consider the multiple linear regression model that predicts U.S. real gross national product (`GNPR`) using a linear combination of industrial production index (`IPI`), total employment (`E`), and real wages (`WR`).

For all , is a series of independent Gaussian disturbances with a mean of 0 and variance .

Assume that the prior distributions are:

• . is a 4-by-1 vector of means and is a scaled 4-by-4 positive definite covariance matrix.

• . and are the shape and scale, respectively, of an inverse gamma distribution.

These assumptions and the data likelihood imply a normal-inverse-gamma conjugate model.

Load the Nelson-Plosser data set. Create variables for the response and predictor series.

```load Data_NelsonPlosser varNames = {'IPI' 'E' 'WR'}; X = DataTable{:,varNames}; y = DataTable{:,'GNPR'}; ```

Create a normal-inverse-gamma conjugate prior model for the linear regression parameters. Specify the number of predictors, `p`, and the variable names.

```p = 3; PriorMdl = bayeslm(p,'ModelType','conjugate','VarNames',varNames); ```

`Mdl` is a `conjugateblm` Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance.

Simulate a set of regression coefficients and a value of the disturbance variance form the prior distribution.

```rng(1); % For reproducibility [betaSimPrior,sigma2SimPrior] = simulate(PriorMdl) ```
```betaSimPrior = -33.5917 -49.1445 -37.4492 -25.3632 sigma2SimPrior = 0.1962 ```

`betaSimPrior` is the randomly drawn 4-by-1 vector of regression coefficients corresponding to the names in `PriorMdl.VarNames`. `sigma2SimPrior` is the randomly drawn scalar disturbance variance.

Estimate the posterior distribution.

```PosteriorMdl = estimate(PriorMdl,X,y); ```
```Method: Analytic posterior distributions Number of observations: 62 Number of predictors: 4 Log marginal likelihood -259.348 | Mean Std CI95 Positive Distribution ----------------------------------------------------------------------------------- Intercept | -24.2494 8.7821 [-41.514, -6.985] 0.003 t (-24.25, 8.65^2, 68) IPI | 4.3913 0.1414 [ 4.113, 4.669] 1.000 t (4.39, 0.14^2, 68) E | 0.0011 0.0003 [ 0.000, 0.002] 1.000 t (0.00, 0.00^2, 68) WR | 2.4683 0.3490 [ 1.782, 3.154] 1.000 t (2.47, 0.34^2, 68) Sigma2 | 44.1347 7.8020 [31.427, 61.855] 1.000 IG(34.00, 0.00069) ```

`PosteriorMdl` is a `conjugateblm` Bayesian linear regression model object representing the posterior distribution of the regression coefficients and disturbance variance.

Simulate a set of regression coefficients and a value of the disturbance variance form the posterior distribution.

```[betaSimPost,sigma2SimPost] = simulate(PosteriorMdl) ```
```betaSimPost = -25.9351 4.4379 0.0012 2.4072 sigma2SimPost = 41.9575 ```

`betaSimPost` and `sigma2SimPost` are commensurate with `betaSimPrior` and `sigma2SimPrior`, but are drawn from the posterior instead.

Consider the model in Simulate Parameter Value from Prior and Posterior Distributions.

Load the data, create a conjugate prior model, and then estimate it.

```load Data_NelsonPlosser varNames = {'IPI' 'E' 'WR'}; X = DataTable{:,varNames}; y = DataTable{:,'GNPR'}; p = 3; PriorMdl = bayeslm(p,'ModelType','conjugate','VarNames',varNames); [PosteriorMdl,~,~,~,~,summaryTbl] = estimate(PriorMdl,X,y,'Display',false); ```

Although, the marginal and conditional posterior distributions of and are analytically tractable, the goal of this example is to illustrate how to implement the Gibbs sampler to reproduce known results.

Estimate the model again, but use a Gibbs sampler instead. That is, alternate between sampling from the conditional posterior distributions of the parameters. Sample 10000 times, and create variables for preallocation. Start the sampler by drawing from the conditional posterior of given .

```m = 1e4; BetaDraws = zeros(p + 1,m); sigma2Draws = zeros(1,m + 1); sigma2Draws(1) = 2; rng(1); % For reproducibility for j = 1:m BetaDraws(:,j) = simulate(PriorMdl,X,y,'Sigma2',sigma2Draws(j)); [~,sigma2Draws(j + 1)] = simulate(PriorMdl,X,y,'Beta',BetaDraws(:,j)); end sigma2Draws = sigma2Draws(2:end); % Remove initial value from MCMC sample ```

Graph trace plots of the parameters.

```figure; for j = 1:(p + 1); subplot(2,2,j); plot(BetaDraws(j,:)) ylabel('MCMC draw') xlabel('Simulation index') title(sprintf('Trace plot -- %s',PriorMdl.VarNames{j})); end figure; plot(sigma2Draws) ylabel('MCMC draw') xlabel('Simulation index') title('Trace plot -- Sigma2') ```

The MCMC samples appear to have converged and are mixing well.

Apply a burn in period of 1000 draws, and then compute the means and standard deviations of the MCMC samples. Compare them with estimates from `estimate`.

```bp = 1000; postBetaMean = mean(BetaDraws(:,(bp + 1):end),2); postSigma2Mean = mean(sigma2Draws(:,(bp + 1):end)); postBetaStd = std(BetaDraws(:,(bp + 1):end),[],2); postSigma2Std = std(sigma2Draws((bp + 1):end)); [summaryTbl(:,1:2),table([postBetaMean; postSigma2Mean],... [postBetaStd; postSigma2Std],'VariableNames',{'GibbsMean','GibbsStd'})] ```
```ans = 5x4 table Mean Std GibbsMean GibbsStd _________ __________ _________ __________ Intercept -24.249 8.7821 -24.293 8.748 IPI 4.3913 0.1414 4.3917 0.13941 E 0.0011202 0.00032931 0.0011229 0.00032875 WR 2.4683 0.34895 2.4654 0.34364 Sigma2 44.135 7.802 44.011 7.7816 ```

The estimates are very close. MCMC variations account for the differences.

Consider a Bayesian linear regression model containing a one predictor, a t distributed disturbance variance with a profiled degrees of freedom parameter . That is, let:

• .

These assumptions imply:

is vector of latent scale parameters that attributes low precision to observations that are far from the regression line. is a hyperparameter controlling the influence of on the observations.

For this problem, the Gibbs sampler is well-suited to estimate the coefficients because you can simulate the parameters of a Bayesian linear regression model conditioned on , and then simulate from its conditional distribution.

Generate responses from where and .

```rng('default'); n = 100; x = linspace(0,2,n)'; b0 = 1; b1 = 2; sigma = 0.5; e = randn(n,1); y = b0 + b1*x + sigma*e; ```

Introduce outlying responses by inflating all responses below by a factor of 3.

```y(x < 0.25) = y(x < 0.25)*3; ```

Fit a linear model to the data. Plot the data and the fitted regression line.

```Mdl = fitlm(x,y) figure; plot(Mdl); hl = legend; hold on; ```
```Mdl = Linear regression model: y ~ 1 + x1 Estimated Coefficients: Estimate SE tStat pValue ________ _______ ______ __________ (Intercept) 2.6814 0.28433 9.4304 2.0859e-15 x1 0.78974 0.24562 3.2153 0.0017653 Number of observations: 100, Error degrees of freedom: 98 Root Mean Squared Error: 1.43 R-squared: 0.0954, Adjusted R-Squared 0.0862 F-statistic vs. constant model: 10.3, p-value = 0.00177 ```

The simulated outliers appear to influence the fitted regression line.

Implement this Gibbs sampler.

1. Draw parameters from the posterior distribution of . To do this, deflate the observations by , create a diffuse prior model with two regression coefficients, draw a set of parameters from the posterior. The first regression coefficient corresponds to the intercept, so specify that `bayeslm` should not include one.

2. Compute residuals.

3. Draw values from the conditional posterior of .

Run the Gibbs sampler for 20000 iterations and apply a burn in period of 5000. Specify , preallocate for the posterior draws, initialize to a vector of ones.

```m = 20000; nu = 1; burnin = 5000; lambda = ones(n,m + 1); estBeta = zeros(2,m + 1); estSigma2 = zeros(1,m + 1); for j = 1:m yDef = y./sqrt(lambda(:,j)); xDef = [ones(n,1) x]./sqrt(lambda(:,j)); PriorMdl = bayeslm(2,'Model','diffuse','Intercept',false); [estBeta(:,j + 1),estSigma2(1,j + 1)] = simulate(PriorMdl,xDef,yDef); ep = y - [ones(n,1) x]*estBeta(:,j + 1); sp = (nu + 1)/2; sc = 2./(nu + ep.^2/estSigma2(1,j + 1)); lambda(:,j + 1) = 1./gamrnd(sp,sc); end ```

It is good practice to diagnose the MCMC sampler by viewing trace plot. This example skips this important task for brevity.

Compute the mean of the draws from the posterior of the regression coefficients. Remove the burn-in-period draws.

```postEstBeta = mean(estBeta(:,(burnin+1):end),2) ```
```postEstBeta = 1.3971 1.7051 ```

The estimate of the intercept is lower, and the slope is higher, than estimates returned by `fitlm`.

Plot the robust regression line with the regression line fitted by least squares.

```h = gca; xlim = h.XLim'; plotY = [ones(2,1) xlim]*postEstBeta; plot(xlim,plotY,'LineWidth',2); hl.String{4} = 'Robust Bayes'; ```

The regression line fit using robust Bayesian regression appears to be a better fit.

The maximum a posteriori probability (MAP) estimate is the posterior mode, that is, the parameter value that yields the maximum of the posterior probability distribution function. If the posterior is analytically intractable, then you can use Monte Carlo sampling to estimate the MAP.

This example is based on Simulate Parameter Value from Prior and Posterior Distributions.

Load the Nelson-Plosser data set. Create variables for the response and predictor series.

```load Data_NelsonPlosser varNames = {'IPI' 'E' 'WR'}; X = DataTable{:,varNames}; y = DataTable{:,'GNPR'}; ```

Create a normal-inverse-gamma conjugate prior model for the linear regression parameters. Specify the number of predictors, `p`, and the variable names.

```p = 3; PriorMdl = bayeslm(p,'ModelType','conjugate','VarNames',varNames) ```
```PriorMdl = | Mean Std CI95 Positive Distribution ----------------------------------------------------------------------------------- Intercept | 0 70.7107 [-141.273, 141.273] 0.500 t (0.00, 57.74^2, 6) IPI | 0 70.7107 [-141.273, 141.273] 0.500 t (0.00, 57.74^2, 6) E | 0 70.7107 [-141.273, 141.273] 0.500 t (0.00, 57.74^2, 6) WR | 0 70.7107 [-141.273, 141.273] 0.500 t (0.00, 57.74^2, 6) Sigma2 | 0.5000 0.5000 [ 0.138, 1.616] 1.000 IG(3.00, 1) ```

Estimate the posterior distribution. Return posterior estimates of the regression coefficients.

```[PosteriorMdl,estBetaMean,EstBetaCov] = estimate(PriorMdl,X,y); ```
```Method: Analytic posterior distributions Number of observations: 62 Number of predictors: 4 Log marginal likelihood -259.348 | Mean Std CI95 Positive Distribution ----------------------------------------------------------------------------------- Intercept | -24.2494 8.7821 [-41.514, -6.985] 0.003 t (-24.25, 8.65^2, 68) IPI | 4.3913 0.1414 [ 4.113, 4.669] 1.000 t (4.39, 0.14^2, 68) E | 0.0011 0.0003 [ 0.000, 0.002] 1.000 t (0.00, 0.00^2, 68) WR | 2.4683 0.3490 [ 1.782, 3.154] 1.000 t (2.47, 0.34^2, 68) Sigma2 | 44.1347 7.8020 [31.427, 61.855] 1.000 IG(34.00, 0.00069) ```

The display includes the marginal posterior distributions. `estBetaMean` is a 4-by-1 vector representing the mean of the marginal posterior of . `estBetaCov` is a 4-by-4 matrix representing the covariance matrix of the posterior of .

Draw 10000 parameter values from the posterior distribution.

```rng(1); % For reproducibility [BetaSim,sigma2Sim] = simulate(PosteriorMdl,'NumDraws',1e5); ```

`BetaSim` is a 4-by-10000 matrix of randomly drawn regression coefficients. `sigma2Sim` is a 1-by-10000 vector of randomly drawn disturbance variances.

Transpose and standardize the matrix of regression coefficients. Compute the correlation matrix of the regression coefficients.

```estBetaStd = sqrt(diag(EstBetaCov)'); BetaSim = BetaSim'; BetaSimStd = (BetaSim - estBetaMean')./estBetaStd; BetaCorr = corrcov(EstBetaCov); BetaCorr = (BetaCorr + BetaCorr')/2; % Enforce symmetry ```

Because the marginal posterior distributions are known, evaluate the posterior pdf at all simulated values.

```betaPDF = mvtpdf(BetaSimStd,BetaCorr,68); a = 34; b = 0.00069; igPDF = @(x,ap,bp)1./(gamma(ap).*bp.^ap).*x.^(-ap-1).*exp(-1./(x.*bp));... % Inverse gamma pdf sigma2PDF = igPDF(sigma2Sim,a,b); ```

Find the simulated values that maximize the respective pdfs, that is, the posterior modes.

```[~,idxMAPBeta] = max(betaPDF); [~,idxMAPSigma2] = max(sigma2PDF); betaMAP = BetaSim(idxMAPBeta,:); sigma2MAP = sigma2Sim(idxMAPSigma2); ```

`betaMAP` and `sigma2MAP` are the MAP estimates.

Because the posterior of is symmetric and unimodal, the posterior mean and MAP should be the same. Compare the MAP estimate of with its posterior mean.

```table(betaMAP',PosteriorMdl.Mu,'VariableNames',{'MAP','Mean'},... 'RowNames',PriorMdl.VarNames) ```
```ans = 4x2 table MAP Mean _________ _________ Intercept -24.559 -24.249 IPI 4.3964 4.3913 E 0.0011389 0.0011202 WR 2.4473 2.4683 ```

The estimates are fairly close to one another.

Estimate the analytical mode of the posterior of . Compare it to the estimated MAP of .

```igMode = 1/(b*(a+1)) sigma2MAP ```
```igMode = 41.4079 sigma2MAP = 41.4075 ```

The estimates are also fairly close.

## Input Arguments

collapse all

Bayesian linear regression model, specified as an object in this table.

ValueBayesian Linear Regression Model Description
`conjugateblm` model objectDependent, normal-inverse-gamma conjugate model returned by `bayeslm` or `estimate`
`semiconjugateblm` model objectIndependent, normal-inverse-gamma semiconjugate model returned by `bayeslm`
`diffuseblm` model objectDiffuse prior model returned by `bayeslm`
`empiricalblm` model objectSamples from prior distributions characterize prior model returned by `bayeslm` or `estimate`
`customblm` model objectPrior distribution function that you declare returned by `bayeslm`

Typically, model objects returned by `estimate` represent marginal posterior distributions. When you estimate a posterior using `estimate`, if you specify estimation of a conditional posterior, then `estimate` returns the prior model.

If `Mdl` is a `diffuseblm` model, then you must also supply `X` and `y` because `simulate` cannot draw from an improper prior distribution.

Predictor data for the multiple linear regression model, specified as a `numObservations`-by-`PriorMdl.NumPredictors` numeric matrix. `numObservations` is the number of observations, and must be equal to the length of `y`.

If `Mdl` is a posterior distribution, then the columns of `X` must correspond to the columns of the predictor data used to estimate the posterior.

Data Types: `double`

Response data for the multiple linear regression model, specified as a numeric vector with `numObservations` elements.

Data Types: `double`

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside single quotes (`' '`). You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `'Sigma2',2` specifies simulating from the conditional posterior distribution of the regression coefficients given the data and that the specified disturbance variance is `2`.

#### Options for All Models

collapse all

Number of draws to sample from the distribution `Mdl`, specified as the comma-separated pair consisting of `'NumDraws'` and a positive integer.

### Tip

If `Mdl` is an `empiricalblm` or a `customblm`, then it is good practice to specify a burn-in period using `BurnIn` and a thinning multiplier using `Thin`. For details on the adjusted sample size, see Algorithms.

Example: `'NumDraws',1e7`

Data Types: `double`

#### Options for All Models Except Empirical

collapse all

Value of the regression coefficients for simulation from conditional distribution of the disturbance variance, specified as the comma-separated pair consisting of `'Beta'` and a (`Mdl.Intercept` + `Mdl.NumPredictors`)-by-1 numeric vector. That is, when using a posterior distribution, `simulate` draws from π(σ2|`y`,`X`,β = `Beta`), where `y` is `y`, `X` is `X`, and `Beta` is the value of `'Beta'`. If `Mdl.Intercept` is `true`, then `Beta(1)` corresponds to the model intercept. All other values correspond to the predictor variables composing the columns of `X`.

You cannot specify `Beta` and `Sigma2` simultaneously.

By default, `simulate` does not draw from the conditional posterior of σ2.

Example: `'Beta',1:3`

Data Types: `double`

Value of the disturbance variance for simulation from the conditional distribution of the regression coefficients, specified as the comma-separated pair consisting of `'Sigma2'` and a positive numeric scalar. That is, when using a posterior distribution, `simulate` draws from π(β|`y`,`X`,`Sigma2`), where `y` is `y`, `X` is `X`, and `Sigma2` is the value of `'Sigma2'`.

You cannot specify `Beta` and `Sigma2` simultaneously.

By default, `simulate` does not draw from the conditional posterior of β.

Example: `'Sigma2',1`

Data Types: `double`

#### Options for Semiconjugate and Custom Prior Distributions

collapse all

Number of draws to remove from beginning of sample to reduce transient effects, specified as the comma-separated pair consisting of `'BurnIn'` and nonnegative scalar. For details on how `simulate` reduces the full sample, see Algorithms.

### Tip

To help you specify the appropriate burn-in period size, determine the extent of the transient behavior in the sample by specifying `'BurnIn',0`, simulating a few thousand observations using `simulate`, and then plotting the paths.

Example: `'BurnIn',0`

Data Types: `double`

Adjusted sample size multipler, specified as the comma-separated pair consisting of `'Thin'` and a positive integer.

The actual sample size is `BurnIn` + `NumDraws``*Thin`. After discarding the burn-in, `simulate` discards every `Thin``1` draws, and then retains the next. For more details on how `simulate` reduces the full sample, see Algorithms.

### Tip

To reduce potential large serial correlation in the Monte Carlo sample or to reduce the memory consumption of the draws stored in `PosteriorMdl`, specify a large value of `Thin`.

Example: `'Thin',5`

Data Types: `double`

Starting values of the regression coefficients for the sample, specified as the comma-separated pair consisting of `'BetaStart'` and a numeric column vector with (`Mdl.Intercept` + `Mdl.NumPredictors`) elements. By default, `BetaStart` is the ordinary least squares estimate.

### Tip

It is good practice to run `simulate` many times using different parameter starting values. Verify that your estimates from each run converge to similar values.

Example: `'BetaStart',[1; 2; 3]`

Data Types: `double`

Starting values of the disturbance variance for the sample, specified as the comma-separated pair consisting of `'Sigma2Start'` and a positive numeric scalar. By default, `Sigma2Start` is the residual mean squared error of the OLS estimator.

### Tip

It is good practice to run `simulate` many times using different parameter starting values. Verify that your estimates from each run converge to similar values.

Example: `'Sigma2Start',4`

Data Types: `double`

#### Options for Custom Prior Distributions

collapse all

Reparameterize σ2 as log(σ2) during posterior estimation and simulation, specified as the comma-separated pair consisting of `'Reparameterize'` and a value in this table.

ValueDescription
`false``simulate` does not reparameterize σ2.
`true``simulate` reparameterizes σ2 as log(σ2). `simulate` converts results back to the original scale, and does not change the functional form of `PriorMdl.LogPDF`.

### Tip

If you experience numerical instabilities during the posterior estimation or simulation of σ2, then specify `'Reparameterize',true`.

Example: `'Reparameterize',true`

Data Types: `logical`

MCMC sampler, specified as the comma-separated pair consisting of `'Sampler'` and a value in this table.

ValueDescription
`'slice'`Slice sampler
`'metropolis'`Random walk Metropolis sampler
`'hmc'`Hamiltonian Monte Carlo (HMC) sampler

### Tip

• To increase the quality of the MCMC draws, tune the sampler.

1. Before calling `simulate`, specify the tuning parameters and their values by using `sampleroptions`. For example, to specify the slice sampler width `width`, use

`options = sampleroptions('Sampler',"slice",'Width',width);`

2. Specify the object containing the tuning-parameter specifications returned by `sampleroptions` by using the `'Options'` name-value pair argument. For example, to use the tuning-parameter specifications in `options`, specify

``'Options',options``

• If you specify the HMC sampler, then a best practice is to provide the gradient for some variables, at least. `simulate` resorts the numerical computation of any missing partial derivatives (`NaN` values) in the gradient vector.

Example: `'Sampler',"hmc"`

Data Types: `string`

Sampler options, specified as the comma-separated pair consisting of `'Options'` and a structure array returned by `sampleroptions`. Use `'Options'` to specify the MCMC sampler and its tuning parameter values.

Example: `'Options',sampleroptions('Sampler',"hmc")`

Data Types: `struct`

Typical sampling-interval width around the current value in the marginal distributions for the slice sampler, specified as the comma-separated pair consisting of `'Width'` and a positive numeric scalar or a (`PriorMdl.Intercept` + `PriorMdl.NumPredictors` + `1`)-by-1 numeric vector of positive values. The first element corresponds to the model intercept, if one exists in the model. The next `PriorMdl.NumPredictors` elements correspond to the coefficients of the predictor variables ordered by the predictor data columns. The last element corresponds to the model variance.

• If `Width` is a scalar, then `simulate` applies `Width` to all `PriorMdl.NumPredictors` + `PriorMdl.Intercept` + `1` marginal distributions.

• If `Width` is a numeric vector, then `simulate` applies the first element to the intercept (if one exists), the next `PriorMdl.NumPredictors` elements to the regression coefficients corresponding to the predictor variables in `X`, and the last element to the disturbance variance.

• If the sample size (`size(X,1)`) is less than 100, then `Width` is `10` by default.

• If the sample size is at least 100, then, by default, `simulate` sets `Width` to the vector of corresponding posterior standard deviations, assuming a diffuse prior model (`diffuseblm`).

`simulate` dispatches `Width` to the `slicesample` function. For more details, see `slicesample`.

### Tip

• For maximum flexibility, specify the slice sampler width `width` by using the `'Options'` name-value pair argument. For example:

`'Options',sampleroptions('Sampler',"slice",'Width',width)`

• The typical width of the slice sampler does not affect convergence of the MCMC sample. However, it does affect the number of required function evaluations, that is, the efficiency of the algorithm. If the width is too small, then the algorithm can implement an excessive number of function evaluations to determine the appropriate sampling width. If the width is too large, then the algorithm might have to decrease the width to an appropriate size, which requires function evaluations.

Example: `'Width',[100*ones(3,1);10]`

## Output Arguments

collapse all

Simulated regression coefficients, returned as a (`Mdl.Intercept` + `Mdl.NumPredictors`)-by-`NumDraws` numeric matrix. The first row corresponds to the intercept, if one exists in the model, and all other rows correspond the regression coefficients of the predictor variables composing the columns of `X`. Columns correspond to individual, successive, independent draws from the distribution.

Simulated disturbance variance, returned as a 1-by-`NumDraws` numeric vector of positive values. Columns are commensurate with the columns of `BetaSim`.

## Limitations

• `simulate` cannot draw values from an improper distribution, that is, a distribution whose density does not integrate to 1.

• If `Mdl` is an `empiricalblm` model object, then you cannot specify `Beta` or `Sigma2`. That is, you cannot simulate from the conditional posterior distributions using an empirical distribution.

collapse all

### Bayesian Linear Regression Model

A Bayesian linear regression model treats the parameters β and σ2 in the multiple linear regression (MLR) model yt = xtβ + εt as random variables.

For times t = 1,...,T:

• yt is the observed response.

• xt is a 1-by-(p + 1) row vector of observed values of p predictors. To accommodate a model intercept, x1t = 1 for all t.

• β is a (p + 1)-by-1 column vector of regression coefficients corresponding to the variables composing the columns of xt.

• εt is the random disturbance having a mean of zero and Cov(ε) = σ2IT×T, while ε is a T-by-1 vector containing all disturbances. These assumptions imply that the data likelihood is

`$\ell \left(\beta ,{\sigma }^{2}|y,x\right)=\prod _{t=1}^{T}\varphi \left({y}_{t};{x}_{t}\beta ,{\sigma }^{2}\right).$`
ϕ(yt;xtβ,σ2) is the Gaussian probability density with mean xtβ and variance σ2 evaluated at yt;.

Before considering the data, a joint prior distribution assumption is imposed on (β,σ2). In a Bayesian analysis, the beliefs about the distribution of the parameters are updated using information about the parameters gleaned from the likelihood of the data. The result is the joint posterior distribution of (β,σ2) or the conditional posterior distributions of the parameters.

## Algorithms

• Whenever `simulate` must estimate a posterior distribution (for example, when `Mdl` represents a prior distribution and you supply `X` and `y`) and the posterior is analytically tractable, `simulate` simulates directly from the posterior. Otherwise, `simulate` resorts to Monte Carlo simulation to estimate the posterior. For more details, see Posterior Estimation and Inference.

• If `Mdl` is a joint posterior model, then `simulate` simulates data from it differently than if `Mdl` is a joint prior model and you supply `X` and `y`. Hence, if you set the same random seed, and then generate random values both ways, then you will not obtain the same values. However, corresponding empirical distributions based on a sufficient number of draws are effectively equivalent.

• This figure describes how `simulate` reduces the sample using the values of `NumDraws`, `Thin`, and `BurnIn`.

Rectangles represent successive draws from the distribution. `simulate` removes the white rectangles from the sample. The remaining `NumDraws` black rectangles compose the sample.

• If `Mdl` is a `semiconjugateblm` model object, then `simulate` samples from the posterior distribution by applying the Gibbs sampler. Specifically and by default:

1. `simulate` uses the default value of `Sigma2Start` for σ2, and draws a value of β from π(β|σ2,X,y).

2. `simulate` draws a value of σ2 from π(σ2|β,X,y) using the previously generated value of β.

3. Repeat steps 1 and 2 until convergence. To assess convergence, draw a trace plot of the sample.

If you specify `BetaStart`, then `simulate` draws a value of σ2 from π(σ2|β,X,y) to start the Gibbs sampler. `simulate` does not return this generated value of σ2.

• If `Mdl` is an `empiricalblm` model object and you do not supply `X` and `y`, then `simulate` draws from `Mdl.BetaDraws` and `Mdl.Sigma2Draws`. If `NumDraws` is fewer than or equal to `numel(Mdl.Sigma2Draws)`, then `simulate` returns the first `NumDraws` elements of `Mdl.BetaDraws` and `Mdl.Sigma2Draws` as random draws for the corresponding parameter. Otherwise, `simulate` randomly resamples `NumDraws` elements from `Mdl.BetaDraws` and `Mdl.Sigma2Draws`.

• If `Mdl` is a `customblm` model object, then `simulate` uses an MCMC sampler to draw from the posterior distribution. At each iteration, the software concatenates the current values of the regression coefficients and disturbance variance into a (`Mdl.Intercept` + `Mdl.NumPredictors` + 1)-by-1 vector, and passes it to `Mdl.LogPDF`. The value of the disturbance variance is the last element of this vector.

• The HMC sampler requires both the log density and its gradient. The gradient should be a `(NumPredictors+Intercept+1)`-by-1 vector. If the derivatives of certain parameters are difficult to compute, then, in the corresponding locations of the gradient, supply `NaN` values instead. `simulate` replaces `NaN` values with numerical derivatives.