# Documentation

### This is machine translation

Translated by
Mouseover text to see original. Click the button below to return to the English verison of the page.

# diffuseblm

Bayesian linear regression model with diffuse conjugate prior for data likelihood

## Description

The Bayesian linear regression model object `diffuseblm` specifies that the joint prior distribution of (β,σ2) is proportional to 1/σ2 (the diffuse prior model).

The data likelihood is $\prod _{t=1}^{T}\varphi \left({y}_{t};{x}_{t}\beta ,{\sigma }^{2}\right),$ where ϕ(yt;xtβ,σ2) is the Gaussian probability density evaluated at yt with mean xtβ and variance σ2. The resulting marginal and conditional posterior distributions are analytically tractable. For details on the posterior distribution, see Analytically Tractable Posteriors.

In general, when you create a Bayesian linear regression model object, it specifies the joint prior distribution and characteristics of the linear regression model only. That is, the model object is a template intended for further use. Specifically, to incorporate data into the model for posterior distribution analysis, pass the model object and data to the appropriate object function.

## Creation

### Syntax

``PriorMdl = diffuseblm(NumPredictors)``
``PriorMdl = diffuseblm(NumPredictors,Name,Value)``

### Description

example

````PriorMdl = diffuseblm(NumPredictors)` creates a Bayesian linear regression model object (`PriorMdl`) composed of `NumPredictors` predictors and an intercept. The joint prior distribution of (β, σ2) is the diffuse model. `PriorMdl` is a template defining the prior distributions and dimensionality of β.```

example

````PriorMdl = diffuseblm(NumPredictors,Name,Value)` uses additional options specified by one or more `Name,Value` pair arguments. `Name` is a property name, except `NumPredictors`, and `Value` is the corresponding value. `Name` must appear inside single quotes (`''`). You can specify several `Name,Value` pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.```

## Properties

expand all

You can set property values when you create the model object using name-value pair argument syntax, or after model creation using dot notation. For example, to specify that there is no model intercept in `PriorMdl`, a Baysian linear regression model containing three model coefficients, enter

`PriorMdl.Intercept = false;`

Number of predictor variables in the Bayesian multiple linear regression model, specified as a nonnegative integer.

`NumPredictors` must be the same as the number of columns in your predictor data, which you specify during model estimation or simulation.

When specifying `NumPredictors`, exclude any intercept term for the value.

After creating a model, if you change the value `NumPredictors` using dot notation, then `VarNames` reverts to its default value.

Example: `'NumPredictors',2`

Data Types: `double`

Indicate whether to include regression model intercept, specified as the comma-separated pair consisting of `'Intercept'` and a value in this table.

ValueDescription
`false`Exclude an intercept from the regression model. Hence, β is a `p`-dimensional vector, where `p` is the value of the `NumPredictors` property.
`true`Include an intercept in the regression model. Hence, β is a (`p` + 1)-dimensional vector. MATLAB® prepends a T-by-1 vector of ones to the predictor data during estimation and simulation.

If you include a column of ones in the predictor data for an intercept term, then set `Intercept` to `false`.

Example: `'Intercept',false`

Predictor variable names for displays, specified as a string vector or cell vector of character vectors. `VarNames` must contain `NumPredictors` elements. `VarNames(j)` is the name of variable in column `j` of the predictor data set, which you specify during estimation, simulation, and forecasting.

The default is `{'Beta(1)','Beta(2),...,Beta(p)}`, where `p` is the value of `NumPredictors`.

Example: `'VarNames',["UnemploymentRate"; "CPI"]`

Data Types: `string` | `cell` | `char`

## Object Functions

 `estimate` Fit parameters of Bayesian linear regression model to data `simulate` Simulate regression coefficients and disturbance variance of Bayesian linear regression model `forecast` Forecast responses of Bayesian linear regression model `plot` Visualize prior and posterior densities of Bayesian linear regression model parameters `summarize` Distribution summary statistics of Bayesian linear regression model

## Examples

expand all

Consider the multiple linear regression model that predicts U.S. real gross national product (`GNPR`) using a linear combination of industrial production index (`IPI`), total employment (`E`), and real wages (`WR`).

For all , is a series of independent Gaussian disturbances with a mean of 0 and variance .

Suppose that the regression coefficients and the disturbance variance are random variables, and you have no prior knowledge of their values or distribution. That is, you want to use the noninformative Jefferys prior: the joint prior distribution is proportional to .

These assumptions and the data likelihood imply an analytically tractable posterior distribution.

Create a diffuse prior model for the linear regression parameters, which is the default prior model type. Specify the number of predictors, `p`.

```p = 3; Mdl = bayeslm(p) ```
```Mdl = | Mean Std CI95 Positive Distribution ----------------------------------------------------------------------------- Intercept | 0 Inf [ NaN, NaN] 0.500 Proportional to one Beta(1) | 0 Inf [ NaN, NaN] 0.500 Proportional to one Beta(2) | 0 Inf [ NaN, NaN] 0.500 Proportional to one Beta(3) | 0 Inf [ NaN, NaN] 0.500 Proportional to one Sigma2 | Inf Inf [ NaN, NaN] 1.000 Proportional to 1/Sigma2 ```

`Mdl` is a `diffuseblm` Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance. At the command window, `bayeslm` displays a summary of the prior distributions. Because the prior is noninformative and the data have not been incorporated yet, the summary is trivial.

You can set writable property values of created models using dot notation. Specify the regression coefficient names to the corresponding variable names.

```Mdl.VarNames = ["IPI" "E" "WR"] ```
```Mdl = | Mean Std CI95 Positive Distribution ----------------------------------------------------------------------------- Intercept | 0 Inf [ NaN, NaN] 0.500 Proportional to one IPI | 0 Inf [ NaN, NaN] 0.500 Proportional to one E | 0 Inf [ NaN, NaN] 0.500 Proportional to one WR | 0 Inf [ NaN, NaN] 0.500 Proportional to one Sigma2 | Inf Inf [ NaN, NaN] 1.000 Proportional to 1/Sigma2 ```

This example is based on Create Diffuse Prior Model.

Create a diffuse prior model for the linear regression parameters. Specify the number of predictors, `p`, and the names of the regression coefficients.

```p = 3; PriorMdl = bayeslm(p,'ModelType','diffuse','VarNames',["IPI" "E" "WR"]); ```

Load the Nelson-Plosser data set. Create variables for the response and predictor series.

```load Data_NelsonPlosser X = DataTable{:,PriorMdl.VarNames(2:end)}; y = DataTable{:,'GNPR'}; ```

Estimate the marginal posterior distributions of and .

```PosteriorMdl = estimate(PriorMdl,X,y); ```
```Method: Analytic posterior distributions Number of observations: 62 Number of predictors: 4 | Mean Std CI95 Positive Distribution ------------------------------------------------------------------------------------ Intercept | -24.2536 9.5314 [-43.001, -5.506] 0.006 t (-24.25, 9.37^2, 58) IPI | 4.3913 0.1535 [ 4.089, 4.693] 1.000 t (4.39, 0.15^2, 58) E | 0.0011 0.0004 [ 0.000, 0.002] 0.999 t (0.00, 0.00^2, 58) WR | 2.4682 0.3787 [ 1.723, 3.213] 1.000 t (2.47, 0.37^2, 58) Sigma2 | 51.9790 10.0034 [35.965, 74.937] 1.000 IG(29.00, 0.00069) ```

`PosteriorMdl` is a `conjugateblm` model object storing the joint marginal posterior distribution of and given the data. `estimate` displays a summary of the marginal posterior distributions to the command window. Rows of the summary correspond to regression coefficients and the disturbance variance, and columns to characteristics of the posterior distribution. The characteristics include:

• `CI95`, which contains the 95% Bayesian equitailed credible intervals for the parameters. For example, the posterior probability that the regression coefficient of `WR` is in [1.723, 3.213] is 0.95.

• `Positive`, which contains the posterior probability that the parameter is greater than 0. For example, the probability that the intercept is greater than 0 is 0.006.

• `Distribution`, which contains descriptions of the posterior distributions of the parameters. For example, the marginal posterior distribution of `IPI` is t with a mean of 4.39, a standard deviation of 0.15, and 58 degrees of freedom.

Access properties of the posterior distribution using dot notation. For example, display the marginal posterior means by accessing the `Mu` property.

```PosteriorMdl.Mu ```
```ans = -24.2536 4.3913 0.0011 2.4682 ```

This example is based on Create Diffuse Prior Model.

Create a diffuse prior model for the linear regression parameters. Specify the number of predictors, `p`, and the names of the regression coefficients.

```p = 3; PriorMdl = bayeslm(p,'ModelType','diffuse','VarNames',["IPI" "E" "WR"]) ```
```PriorMdl = | Mean Std CI95 Positive Distribution ----------------------------------------------------------------------------- Intercept | 0 Inf [ NaN, NaN] 0.500 Proportional to one IPI | 0 Inf [ NaN, NaN] 0.500 Proportional to one E | 0 Inf [ NaN, NaN] 0.500 Proportional to one WR | 0 Inf [ NaN, NaN] 0.500 Proportional to one Sigma2 | Inf Inf [ NaN, NaN] 1.000 Proportional to 1/Sigma2 ```

Load the Nelson-Plosser data set. Create variables for the response and predictor series.

```load Data_NelsonPlosser X = DataTable{:,PriorMdl.VarNames(2:end)}; y = DataTable{:,'GNPR'}; ```

Estimate the conditional posterior distributions of given and the data.

```[Mdl,condPostMeanBeta,CondPostCovBeta] = estimate(PriorMdl,X,y,... 'Sigma2',2); ```
```Method: Analytic posterior distributions Conditional variable: Sigma2 fixed at 2 Number of observations: 62 Number of predictors: 4 | Mean Std CI95 Positive Distribution -------------------------------------------------------------------------------- Intercept | -24.2536 1.8696 [-27.918, -20.589] 0.000 N (-24.25, 1.87^2) IPI | 4.3913 0.0301 [ 4.332, 4.450] 1.000 N (4.39, 0.03^2) E | 0.0011 0.0001 [ 0.001, 0.001] 1.000 N (0.00, 0.00^2) WR | 2.4682 0.0743 [ 2.323, 2.614] 1.000 N (2.47, 0.07^2) Sigma2 | 2 0 [ 2.000, 2.000] 1.000 Fixed value ```

`estimate` returns the 4-by-1 vector of means and the 4-by-4 covariance matrix of the conditional posterior distribution of the in `condPostMeanBeta` and `CondPostCovBeta`, respectively. Also, `estimate` displays a summary of the conditional posterior distribution of .

Display `Mdl`.

```Mdl ```
```Mdl = | Mean Std CI95 Positive Distribution ----------------------------------------------------------------------------- Intercept | 0 Inf [ NaN, NaN] 0.500 Proportional to one IPI | 0 Inf [ NaN, NaN] 0.500 Proportional to one E | 0 Inf [ NaN, NaN] 0.500 Proportional to one WR | 0 Inf [ NaN, NaN] 0.500 Proportional to one Sigma2 | Inf Inf [ NaN, NaN] 1.000 Proportional to 1/Sigma2 ```

Because `estimate` computes the conditional posterior distribution, it returns the original prior model, not the posterior, in the first position of the output argument list.

This example is based on Estimate Marginal Posterior Distributions.

Create and estimate the marginal posterior distributions. Extract posterior distribution summary statistics for .

```p = 3; PriorMdl = bayeslm(p,'ModelType','diffuse','VarNames',["IPI" "E" "WR"]); load Data_NelsonPlosser X = DataTable{:,PriorMdl.VarNames(2:end)}; y = DataTable{:,'GNPR'}; [PosteriorMdl,estBeta,estBetaCov] = estimate(PriorMdl,X,y); ```
```Method: Analytic posterior distributions Number of observations: 62 Number of predictors: 4 | Mean Std CI95 Positive Distribution ------------------------------------------------------------------------------------ Intercept | -24.2536 9.5314 [-43.001, -5.506] 0.006 t (-24.25, 9.37^2, 58) IPI | 4.3913 0.1535 [ 4.089, 4.693] 1.000 t (4.39, 0.15^2, 58) E | 0.0011 0.0004 [ 0.000, 0.002] 0.999 t (0.00, 0.00^2, 58) WR | 2.4682 0.3787 [ 1.723, 3.213] 1.000 t (2.47, 0.37^2, 58) Sigma2 | 51.9790 10.0034 [35.965, 74.937] 1.000 IG(29.00, 0.00069) ```

Suppose that if the coefficient of real wages is below 2.5, then a policy is enacted. Although the posterior distribution of `WR` is known, and so you can calculate probabilities directly, you can estimate the probability using Monte Carlo simulation instead.

Draw `1e6` samples from the marginal posterior distribution of .

```NumDraws = 1e6; rng(1); BetaSim = simulate(PosteriorMdl,'NumDraws',NumDraws); ```

`BetaSim` is a 4-by- `1e6` matrix containing the draws. Rows correspond to the regression coefficient and columns to successive draws.

Isolate the draws corresponding to the coefficient of real wages, and then identify which draws are less than 2.5.

```isWR = PosteriorMdl.VarNames == "WR"; wrSim = BetaSim(isWR,:); isWRLT2p5 = wrSim < 2.5; ```

Find the marginal posterior probability that the regression coefficient of `WR` is below 2.5 by computing the proportion of draws that are less than 2.5.

```probWRLT2p5 = mean(isWRLT2p5) ```
```probWRLT2p5 = 0.5341 ```

The posterior probability that the coefficient of real wages is less than 2.5 is about `0.53`.

The marginal posterior distribution of the coefficient of `WR` is a , but centered at 2.47 and scaled by 0.37. Directly compute the posterior probability that the coefficient of `WR` is less than 2.5.

```center = estBeta(isWR); stdBeta = sqrt(diag(estBetaCov)); scale = stdBeta(isWR); t = (2.5 - center)/scale; dof = 68; directProb = tcdf(t,dof) ```
```directProb = 0.5333 ```

The posterior probabilities are nearly identical.

This example is based on Estimate Marginal Posterior Distributions.

Create and estimate the marginal posterior distributions. Hold out the last 10 periods of data from estimation so that they can be used for forecasting real GNP. Turn the estimation display off.

```p = 3; PriorMdl = bayeslm(p,'ModelType','diffuse','VarNames',["IPI" "E" "WR"]); load Data_NelsonPlosser fhs = 10; % Forecast horizon size X = DataTable{1:(end - fhs),PriorMdl.VarNames(2:end)}; y = DataTable{1:(end - fhs),'GNPR'}; XF = DataTable{(end - fhs + 1):end,PriorMdl.VarNames(2:end)}; % Future predictor data yFT = DataTable{(end - fhs + 1):end,'GNPR'}; % True future responses PosteriorMdl = estimate(PriorMdl,X,y,'Display',false);```

Forecast responses using the posterior predictive distribution and using the future predictor data `XF`. Plot the true values of the response and the forecasted values.

```yF = forecast(PosteriorMdl,XF); figure; plot(dates,DataTable.GNPR); hold on plot(dates((end - fhs + 1):end),yF) h = gca; hp = patch([dates(end - fhs + 1) dates(end) dates(end) dates(end - fhs + 1)],... h.YLim([1,1,2,2]),[0.8 0.8 0.8])```
```hp = Patch with properties: FaceColor: [0.8000 0.8000 0.8000] FaceAlpha: 1 EdgeColor: [0 0 0] LineStyle: '-' Faces: [1 2 3 4] Vertices: [4x2 double] Show all properties ```
```uistack(hp,'bottom'); legend('True GNPR','Forecasted GNPR','Forecast Horizon','Location','NW') title('Real Gross National Product: 1909 - 1970'); ylabel('rGNP'); xlabel('Year'); hold off```

`yF` is a 10-by-1 vector of future values of real GNP corresponding to the future predictor data.

Estimate the forecast root mean squared error (RMSE).

`frmse = sqrt(mean((yF - yFT).^2))`
```frmse = 25.5489 ```

Forecast RMSE is a relative measure of forecast accuracy. Specifically, you estimate several models using different assumptions. The model with the lowest forecast RMSE is the best performing model of the ones being compared.

expand all

## Alternatives

You can also create a Bayesian linear regression model with a diffuse prior using `bayeslm`.