Documentation

This is machine translation

Translated by Microsoft
Mouseover text to see original. Click the button below to return to the English verison of the page.

Note: This page has been translated by MathWorks. Please click here
To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

bayeslm

Create Bayesian linear regression model object

Syntax

PriorMdl = bayeslm(NumPredictors)
PriorMdl = bayeslm(NumPredictors,'ModelType',modelType)
PriorMdl = bayeslm(NumPredictors,'ModelType',modelType,Name,Value)

Description

example

PriorMdl = bayeslm(NumPredictors) creates a Bayesian linear regression model object (PriorMdl) composed of NumPredictors predictors, an intercept, and a diffuse, joint prior distribution for β and σ2. PriorMdl is a template that defines the prior distributions and dimensionality of β.

example

PriorMdl = bayeslm(NumPredictors,'ModelType',modelType) specifies the joint prior distribution modelType for β and σ2. For this syntax, modelType can be 'conjugate', 'semiconjugate', or 'diffuse'. For example, 'ModelType','conjugate' specifies conjugate priors for the Gaussian likelihood, that is, β|σ2 as Gaussian, σ2 as inverse gamma.

example

PriorMdl = bayeslm(NumPredictors,'ModelType',modelType,Name,Value) uses additional options specified by one or more Name,Value pair arguments. For example, you can specify whether to include a regression intercept or specify additional options for the joint prior distribution modelType.

  • If you specify 'ModelType','empirical', you must also specify the BetaDraws and Sigma2Draws name-value pair arguments. BetaDraws and Sigma2Draws characterize the respective prior distributions.

  • If you specify 'ModelType','custom', you must also specify the LogPDF name-value pair argument. LogPDF completely characterizes the joint prior distribution.

Examples

collapse all

Consider the multiple linear regression model that predicts U.S. real gross national product (GNPR) using a linear combination of industrial production index (IPI), total employment (E), and real wages (WR).

For all , is a series of independent Gaussian disturbances with a mean of 0 and variance .

Suppose that the regression coefficients and the disturbance variance are random variables, and you have no prior knowledge of their values or distribution. That is, you want to use the noninformative Jefferys prior: the joint prior distribution is proportional to .

These assumptions and the data likelihood imply an analytically tractable posterior distribution.

Create a diffuse prior model for the linear regression parameters, which is the default model type. Specify the number of predictors, p.

p = 3;
Mdl = bayeslm(p)
Mdl = 

 
           | Mean  Std        CI95        Positive        Distribution       
-----------------------------------------------------------------------------
 Intercept |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 Beta(1)   |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 Beta(2)   |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 Beta(3)   |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 Sigma2    |  Inf  Inf  [   NaN,    NaN]    1.000   Proportional to 1/Sigma2 
 

Mdl is a diffuseblm Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance. At the command window, bayeslm displays a summary of the prior distributions. Because the prior is noninformative and the data have not been incorporated yet, the summary is trivial.

If you have data, then you can estimate characteristics of the posterior distribution by passing the prior model Mdl and data to estimate.

Consider the multiple linear regression model from Default Diffuse Prior Model, but instead assume that the prior distributions are:

  • . is a 4-by-1 vector of means and is a scaled 4-by-4 positive definite covariance matrix.

  • . and are the shape and scale, respectively, of an inverse gamma distribution.

These assumptions and the data likelihood imply a normal-inverse-gamma semiconjugate model. That is, the conditional posteriors are conjugate to the prior with respect to the data likelihood, but the marginal posterior is analytically intractable.

Create a normal-inverse-gamma semiconjugate prior model for the linear regression parameters. Specify the number of predictors, p.

p = 3;
Mdl = bayeslm(p,'ModelType','semiconjugate')
Mdl = 

 
           |  Mean     Std           CI95         Positive     Distribution    
-------------------------------------------------------------------------------
 Intercept |  0       100    [-195.996, 195.996]    0.500   N (0.00, 100.00^2) 
 Beta(1)   |  0       100    [-195.996, 195.996]    0.500   N (0.00, 100.00^2) 
 Beta(2)   |  0       100    [-195.996, 195.996]    0.500   N (0.00, 100.00^2) 
 Beta(3)   |  0       100    [-195.996, 195.996]    0.500   N (0.00, 100.00^2) 
 Sigma2    | 0.5000  0.5000    [ 0.138,  1.616]     1.000   IG(3.00,    1)     
 

Mdl is a semiconjugateblm Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance. At the command window, bayeslm displays a summary of the prior distributions. For example, the elements of Positive represent the prior probability that the corresponding parameter is positive.

If you have data, then you can estimate characteristics of the marginal or conditional posterior distribution by passing the prior model Mdl and data to estimate.

Consider the multiple linear regression model from Default Diffuse Prior Model, but instead assume that the prior distributions are:

  • . is a 4-by-1 vector of means and is a scaled 4-by-4 positive definite covariance matrix. Suppose that from past experience, and V is the identity matrix.

  • . and are the shape and scale, respectively, of an inverse gamma distribution.

These assumptions and the data likelihood imply a normal-inverse-gamma conjugate model.

Create a normal-inverse-gamma conjugate prior model for the linear regression parameters. Specify the number of predictors, p,and set the regression coefficient names to the corresponding variable names.

p = 3;
Mdl = bayeslm(p,'ModelType','conjugate','Mu',[-20; 4; 0.1; 2],'V',eye(4),...
    'VarNames',["IPI" "E" "WR"])
Mdl = 

 
           |  Mean     Std          CI95         Positive       Distribution      
----------------------------------------------------------------------------------
 Intercept |  -20    0.7071  [-21.413, -18.587]    0.000   t (-20.00, 0.58^2,  6) 
 IPI       |  4      0.7071   [ 2.587,  5.413]     1.000   t (4.00, 0.58^2,  6)   
 E         | 0.1000  0.7071   [-1.313,  1.513]     0.566   t (0.10, 0.58^2,  6)   
 WR        |  2      0.7071   [ 0.587,  3.413]     0.993   t (2.00, 0.58^2,  6)   
 Sigma2    | 0.5000  0.5000   [ 0.138,  1.616]     1.000   IG(3.00,    1)         
 

Mdl is a conjugateblm Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance. At the command window, bayeslm displays a summary of the prior distributions. Although bayeslm always chooses names for the intercept and disturbance variance, all other coefficients have the specified names.

By default, bayeslm sets the shape and scale to 3 and 1, respectively. Suppose that from past experience, the shape and scale are 5 and 2.

Set the prior shape and scale of to their assumed values.

Mdl.A = 5;
Mdl.B = 2
Mdl = 

 
           |  Mean     Std          CI95         Positive       Distribution      
----------------------------------------------------------------------------------
 Intercept |  -20    0.3536  [-20.705, -19.295]    0.000   t (-20.00, 0.32^2, 10) 
 IPI       |  4      0.3536   [ 3.295,  4.705]     1.000   t (4.00, 0.32^2, 10)   
 E         | 0.1000  0.3536   [-0.605,  0.805]     0.621   t (0.10, 0.32^2, 10)   
 WR        |  2      0.3536   [ 1.295,  2.705]     1.000   t (2.00, 0.32^2, 10)   
 Sigma2    | 0.1250  0.0722   [ 0.049,  0.308]     1.000   IG(5.00,    2)         
 

bayeslm updates the prior distribution summary based on the changes in the shape and scale.

Consider the multiple linear regression model from Default Diffuse Prior Model, but instead suppose that from prior experience, the distributions are:

  • is 4-dimensional t distribution with 50 degrees of freedom for each component and the identity matirx for the correlation matrix. Also, the distribution is centered at and each component is scaled by the corresponding elements of the vector .

  • .

bayeslm treats these assumptions and the data likelihood as if the corresponding posterior is analytically intractable.

Declare a MATLAB® function that:

  • Accepts values of and together in a column vector, and values of the hyperparameters.

  • Returns the value of the joint prior distribution, , given the values of and .

function logPDF = priorMVTIG(params,ct,st,dof,C,a,b)
%priorMVTIG Log density of multivariate t times inverse gamma 
%   priorMVTIG passes params(1:end-1) to the multivariate t density
%   function with dof degrees of freedom for each component and positive
%   definite correlation matrix C. priorMVTIG returns the log of the product of
%   the two evaluated densities.
%   
%   params: Parameter values at which the densities are evaluated, an
%           m-by-1 numeric vector.
%
%       ct: Multivariate t distribution component centers, an (m-1)-by-1
%           numeric vector.  Elements correspond to the first m-1 elements
%           of params.
%
%       st: Multivariate t distribution component scales, an (m-1)-by-1
%           numeric (m-1)-by-1 numeric vector.  Elements correspond to the
%           first m-1 elements of params.
%
%      dof: Degrees of freedom for the multivariate t distribution, a
%           numeric scalar or (m-1)-by-1 numeric vector. priorMVTIG expands
%           scalars such that dof = dof*ones(m-1,1). Elements of dof
%           correspond to the elements of params(1:end-1).
%
%        C: correlation matrix for the multivariate t distribution, an
%           (m-1)-by-(m-1) symmetric, positive definite matrix. Rows and
%           columns correspond to the elements of params(1:end-1).
% 
%        a: Inverse gamma shape parameter, a positive numeric scalar.
%
%        b: Inverse gamma scale parameter, a positive scalar.
%
beta = params(1:(end-1));
sigma2 = params(end);

tVal = (beta - ct)./st;
mvtDensity = mvtpdf(tVal,C,dof);
igDensity = sigma2^(-a-1)*exp(-1/(sigma2*b))/(gamma(a)*b^a);

logPDF = log(mvtDensity*igDensity);
end


Create an anonymous function that operates like priorMVTIG, accepts the parameter values only and holds the hyperparameter values fixed to their values.

dof = 50;
C = eye(4);
ct = [-25; 4; 0; 3];
st = [10; 1; 1; 1];
a = 3;
b = 1;
prior = @(params)priorMVTIG(params,ct,st,dof,C,a,b);

Create a custom joint prior model for the linear regression parameters. Specify the number of predictors, p. Also, specify the function handle for priorMVTIG, and pass the hyperparameter values.

p = 3;
Mdl = bayeslm(p,'ModelType','custom','LogPDF',prior)
Mdl = 

The priors are defined by the function:
    @(params)priorMVTIG(params,ct,st,dof,C,a,b)

Mdl is a customblm Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance. Unlike other Bayesian linear regression model objects, bayeslm does not display a summary of the prior distributions at the command window.

Input Arguments

collapse all

Number of predictor variables in the Bayesian multiple linear regression model, specified as a nonnegative integer.

NumPredictors must be the same as the number of columns in your predictor data, which you specify during model estimation or simulation.

When counting the number of predictors in the model, exclude the intercept term specified by Intercept. If you include a column of ones in the predictor data for an intercept term, then count it as a predictor variable and specify 'Intercept',false.

Data Types: double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'ModelType','conjugate','Mu',1:3,'V',1000*eye(3),'A',1,'B',0.5 specifies that the prior distribution of Beta given Sigma2 is Gaussian with mean vector 1:3 and covariance matrix Sigma2*1000*eye(3), and the distribution of Sigma2 is inverse gamma with shape 1 and scale 0.5.

Options for All Prior Distributions

collapse all

Joint prior distribution of (β,σ2), specified as the comma-separated pair consisting of 'ModelType' and a value in this table.

ValueDescription
'conjugate'

Normal-inverse-gamma conjugate model

  • The prior distributions are

    β|σ2~Np+1(μ,σ2V)σ2~IG(A,B)

    β and σ2 are dependent.

  • The corresponding marginal and conditional posterior distributions have closed forms (see Analytically Tractable Posteriors).

You can adjust corresponding hyperparameters using the Mu, V, A, and B name-value pair arguments.

'semiconjugate'

Normal-inverse-gamma semi-conjugate model

  • The prior distributions are

    β|σ2~Np+1(μ,V)σ2~IG(A,B)

    β and σ2 are independent.

  • The corresponding marginal posterior distribution does not have a closed form, but conditional posterior distributions do (see Analytically Tractable Posteriors).

You can adjust corresponding hyperparameters using the Mu, V, A, and B name-value pair arguments.

'diffuse'

Diffuse prior distributions

  • The joint prior pdf is

    fβ,σ2(β,σ2)1σ2.

  • The corresponding marginal and conditional posterior distributions have closed forms (see Analytically Tractable Posteriors).

'empirical'

Custom prior distributions

  • You must also specify the BetaDraws and Sigma2Draws name-value pair arguments.

  • The corresponding marginal and conditional posterior distributions do not have closed forms.

  • Empirical models are better suited for updating a posterior distribution based on new data.

'custom'

Custom prior distributions

  • You must also specify the LogPDF name-value pair argument.

  • The corresponding marginal and conditional posterior distributions do not have closed forms.

The prior model type that you choose depends on your assumptions on the joint distribution of the parameters. Your choice can impact posterior estimates and inferences. For more details, see Bayesian Linear Regression Workflow.

Example: 'ModelType','conjugate'

Data Types: char

Indicate whether to include a regression model intercept, specified as the comma-separated pair consisting of 'Intercept' and a value in this table.

ValueDescription
falseExclude an intercept from the regression model. Hence, β is a p-dimensional vector, where p is the value of NumPredictors.
trueInclude an intercept in the regression model. Hence, β is a (p + 1)-dimensional vector. This specification causes a T-by-1 vector of ones to be prepended to the predictor data during estimation and simulation.

If you include a column of ones in the predictor data for an intercept term, then specify 'Intercept',false.

Example: 'Intercept',false

Predictor variable names for displays, specified as the comma-separated pair consisting of 'VarNames' and a string vector or cell vector of character vectors. VarNames must contain NumPredictors elements. VarNames(j) is the name of the variable in column j of the predictor data set, which you specify during estimation, simulation, and forecasting.

The default is {'Beta(1)','Beta(2)',...,'Beta(p)'}, where p is the value of NumPredictors.

Note

You cannot set the name of the intercept or disturbance variance. In displays, bayeslm gives the intercept the name Intercept and the disturbance variance the name Sigma2. Therefore, you cannot use "Intercept" and "Sigma2" as predictor names.

Example: 'VarNames',["UnemploymentRate"; "CPI"]

Data Types: string | cell | char

Options for Conjugate and Semi-Conjugate Joint Prior Distributions

collapse all

Mean hyperparameter of the Gaussian prior on β, specified as the comma-separated pair consisting of 'Mu' and a numeric scalar or vector.

If Mu is a vector, then it must have NumPredictors or NumPredictors + 1 elements.

  • For NumPredictors elements, bayeslm sets the prior mean of the NumPredictors predictors only. Predictors correspond to the columns in the predictor data (specified during estimation, simulation, or forecasting). bayeslm ignores the intercept in the model, that is, bayeslm specifies the default prior mean to any intercept.

  • For NumPredictors + 1 elements, the first element corresponds to the prior mean of the intercept, and all other elements correspond to the predictors.

Example: 'Mu',[1; 0.08; 2]

Data Types: double

Conditional covariance matrix hyperparameter of Gaussian prior on β, specified as the comma-separated pair consisting of 'V' and a c-by-c symmetric, positive definite matrix. c can be NumPredictors or NumPredictors + 1.

  • If c is NumPredictors, then bayeslm sets the prior covariance matrix to

    [1e5000V0].

    That is, bayeslm attributes the default prior covariances to the intercept, and V to the coefficients of the predictor variables in the data. Rows and columns of V correspond to columns (variables) in the predictor data.

  • If c is NumPredictors + 1, then bayeslm sets the entire prior covariance to V. The first row and column correspond to the intercept. All other rows and columns correspond to the columns in the predictor data.

The default value is a flat prior. For an adaptive prior, specify diag(Inf(Intercept + NumPredictors,1)). Adaptive priors indicate zero precision, that is, you want the prior distribution to have as little influence on the posterior distribution as possible.

For 'ModelType',conjugate, V is the prior covariance of β up to a factor of σ2.

Example: 'V',diag(Inf(3,1))

Data Types: double

Shape hyperparameter of inverse gamma prior on σ2, specified as the comma-separated pair consisting of 'A' and a numeric scalar.

A must be at least -(NumPredictors + Intercept)/2.

With B held fixed, as A increases, the inverse gamma distribution becomes taller and more concentrated. This characteristic weighs the prior model of σ2 more heavily than the likelihood during posterior estimation.

For the functional form of the inverse gamma distribution, see Analytically Tractable Posteriors.

Example: 'A',0.1

Data Types: double

Scale hyperparameter of inverse gamma prior on σ2, specified as the comma-separated pair consisting of 'B' and a positive scalar or Inf.

With A held fixed, as B increases, the inverse gamma distribution becomes taller and more concentrated. This characteristic weighs the prior model of σ2 more heavily than the likelihood during posterior estimation.

Example: 'B',5

Data Types: double

Requirements for Empirical Joint Prior Distributions

collapse all

Random sample from the prior distribution of β, specified as the comma-separated pair consisting of 'BetaDraws' and a (Intercept + NumPredictors)-by-NumDraws numeric matrix. Rows correspond to regression coefficients: the first row corresponds to the intercept and the following rows correspond to columns in the predictor data. Columns correspond to successive draws from the prior distribution.

BetaDraws and Sigma2Draws must have the same number of columns.

For best results, draw a large number of samples.

Data Types: double

Random sample from the prior distribution of σ2, specified as the comma-separated pair consisting of 'Sigma2Draws' and a 1-by-NumDraws numeric row vector. Columns correspond to successive draws from the prior distribution.

BetaDraws and Sigma2Draws must have the same number of columns.

For best results, draw a large number of samples.

Data Types: double

Requirement for Custom Prior Distributions

collapse all

Log of the joint probability density function of (β,σ2), specified as the comma-separated pair consisting of 'LogPDF' and a function handle.

Suppose logprior is the name of the MATLAB® function defining the joint prior distribution of (β,σ2). Then, logprior must have this form.

function [logpdf,glpdf] = logprior(params)
	...
end

  • logpdf is a numeric scalar representing the log of the joint probability density of (β,σ2).

  • glpdf is an (Intercept + NumPredictors + 1)-by-1 numeric vector representing the gradient of logpdf. Elements correspond to the elements of params.

    glpdf is an optional output argument, and only the Hamiltonian Monte Carlo sampler (see hmcSampler) applies it. If you know the analytical partial derivative with respect to some parameters, but not others, then set the elements of glpdf corresponding to the unknown partial derivatives to NaN. MATLAB computes the numerical gradient for missing partial derivatives, which is convenient, but slows sampling.

  • params is an (Intercept + NumPredictors + 1)-by-1 numeric vector. The first Intercept + NumPredictors elements must correspond to values of β, and the last element must correspond to the value of σ2. The first element of β is the intercept, if one exists. All other elements correspond to predictor variables in the predictor data, which you specify during estimation, simulation, or forecasting.

Example: 'LogPDF',@logprior

Output Arguments

collapse all

Bayesian linear regression model storing prior model assumptions, returned as an object in this table.

Value of ModelTypeReturned Bayesian Linear Regression Model Object
'conjugate'conjugateblm
'semiconjugate'semiconjugateblm
'diffuse'diffuseblm
'empirical'empiricalblm
'custom'customblm

PriorMdl specifies the joint prior distribution and characteristics of the linear regression model only. That is, the model object is a template intended for further use. To incorporate data into the model for posterior distribution analysis, pass the model object and data to the appropriate object function, for example, estimate or simulate.

More About

collapse all

Bayesian Linear Regression Model

A Bayesian linear regression model treats the parameters β and σ2 in the multiple linear regression (MLR) model yt = xtβ + εt as random variables.

For times t = 1,...,T:

  • yt is the observed response.

  • xt is a 1-by-(p + 1) row vector of observed values of p predictors. To accommodate a model intercept, x1t = 1 for all t.

  • β is a (p + 1)-by-1 column vector of regression coefficients corresponding to the variables composing the columns of xt.

  • εt is the random disturbance having a mean of zero and Cov(ε) = σ2IT×T, while ε is a T-by-1 vector containing all disturbances. These assumptions imply that the data likelihood is

    (β,σ2|y,x)=t=1Tϕ(yt;xtβ,σ2).

    ϕ(yt;xtβ,σ2) is the Gaussian probability density with mean xtβ and variance σ2 evaluated at yt;.

Before considering the data, a joint prior distribution assumption is imposed on (β,σ2). In a Bayesian analysis, the beliefs about the distribution of the parameters are updated using information about the parameters gleaned from the likelihood of the data. The result is the joint posterior distribution of (β,σ2) or the conditional posterior distributions of the parameters.

Introduced in R2017a

Was this topic helpful?