Documentation

This is machine translation

Translated by Microsoft
Mouseover text to see original. Click the button below to return to the English verison of the page.

Note: This page has been translated by MathWorks. Please click here
To view all translated materals including this page, select Japan from the country navigator on the bottom of this page.

estimate

Fit parameters of Bayesian linear regression model to data

Syntax

PosteriorMdl = estimate(PriorMdl,X,y)
PosteriorMdl = estimate(PriorMdl,X,y,Name,Value)
[PosteriorMdl,estBeta,EstBetaCov,estSigma2,estSigma2Var] = estimate(___)
[PosteriorMdl,estBeta,EstBetaCov,estSigma2,estSigma2Var,Summary] = estimate(___)

Description

example

PosteriorMdl = estimate(PriorMdl,X,y) returns the model that characterizes the joint posterior distributions of β and σ2 of a Bayesian linear regression model. PriorMdl specifies the joint prior distribution of the parameters and the structure of the linear regression model, X is the predictor data, and y is the response data. PriorMdl and PosteriorMdl might not be the same object type.

To produce PosteriorMdl, estimate “updates” the prior distribution with information about the parameters that it gleans from the data.

NaNs in the data indicate missing values, which estimate removes using list-wise deletion.

example

PosteriorMdl = estimate(PriorMdl,X,y,Name,Value) uses additional options specified by one or more Name,Value pair arguments. For example, you can specify a value for one of β or σ2 to estimate the conditional posterior distribution of one parameter given the specified value of the other parameter.

If you specify Beta or Sigma2, then PosteriorMdl and PriorMdl are equal.

example

[PosteriorMdl,estBeta,EstBetaCov,estSigma2,estSigma2Var] = estimate(___) returns:

  • estBeta—The estimated regression coefficients, that is, the mean vector of the posterior distribution of β.

  • EstBetaCov—The estimated covariance matrix of the coefficient estimates, that is, the covariance matrix of the posterior distribution of β.

  • EstSigma2—The estimated disturbance variance, that is, the mean of the posterior distribution of σ2.

  • EstSigma2Var—The estimated variance of the disturbance variance, that is, the variance the posterior distribution of σ2.

If you specify Beta or Sigma2, then estimate returns conditional posterior estimates. Otherwise, estimate returns joint posterior estimates.

example

[PosteriorMdl,estBeta,EstBetaCov,estSigma2,estSigma2Var,Summary] = estimate(___) uses any of the input arguments in the previous syntaxes to additionally return a table containing, for each parameter, posterior estimates, standard errors, 95% credible intervals, posterior probability that the parameter is greater than 0, and if it exists, a description of the posterior distribution.

Examples

collapse all

Load the carsmall data set. Consider a model that predicts the fuel economy (in MPG) of a car given its engine displacement and weight.

load carsmall
x = [Displacement Weight];
y = MPG;

Regress fuel economy onto engine displacement and weight including an intercept to obtain ordinary least squares (OLS) estimates.

Mdl = fitlm(x,y)
Mdl = 
Linear regression model:
    y ~ 1 + x1 + x2

Estimated Coefficients:
                    Estimate        SE         tStat       pValue  
                   __________    _________    _______    __________

    (Intercept)        46.925       2.0858     22.497    6.0509e-39
    x1              -0.014593    0.0082695    -1.7647      0.080968
    x2             -0.0068422    0.0011337    -6.0353    3.3838e-08


Number of observations: 94, Error degrees of freedom: 91
Root Mean Squared Error: 4.09
R-squared: 0.747,  Adjusted R-Squared 0.741
F-statistic vs. constant model: 134, p-value = 7.22e-28
Mdl.MSE
ans = 16.7100

Create a default, diffuse prior distribution for one predictor.

p = 2;
PriorMdl = bayeslm(p);

PriorMdl is a diffuseblm model object.

Estimate the posterior distribution using default options.

PosteriorMdl = estimate(PriorMdl,x,y);
Method: Analytic posterior distributions
Number of observations: 94
Number of predictors:   3
 
           |   Mean     Std         CI95        Positive       Distribution     
--------------------------------------------------------------------------------
 Intercept | 46.9247  2.1091  [42.782, 51.068]    1.000   t (46.92, 2.09^2, 91) 
 Beta(1)   | -0.0146  0.0084  [-0.031,  0.002]    0.040   t (-0.01, 0.01^2, 91) 
 Beta(2)   | -0.0068  0.0011  [-0.009, -0.005]    0.000   t (-0.01, 0.00^2, 91) 
 Sigma2    | 17.0855  2.5905  [12.748, 22.866]    1.000   IG(45.50, 0.0013)     
 

PosteriorMdl is a conjugateblm model object.

The posterior means and the least squares coefficient estimates and standard errors are almost identical. Also, the posterior mean of Sigma2 is close to the OLS MSE.

Consider the multiple linear regression model that predicts U.S. real gross national product (GNPR) using a linear combination of industrial production index (IPI), total employment (E), and real wages (WR).

For all , is a series of independent Gaussian disturbances with a mean of 0 and variance . Assume that the prior distributions are:

  • is a 4-D t distribution with 30 degrees of freedom for each component, correlation matrix C, location ct, and scale st.

  • , with shape and scale .%

bayeslm treats these assumptions and the data likelihood as if the corresponding posterior is analytically intractable.

Declare a MATLAB® function that:

  • Accepts values of and together in a column vector, and values of the hyperparameters.

  • Returns the value of the joint prior distribution, , given the values of and .

function logPDF = priorMVTIG(params,ct,st,dof,C,a,b)
%priorMVTIG Log density of multivariate t times inverse gamma 
%   priorMVTIG passes params(1:end-1) to the multivariate t density
%   function with dof degrees of freedom for each component and positive
%   definite correlation matrix C. priorMVTIG returns the log of the product of
%   the two evaluated densities.
%   
%   params: Parameter values at which the densities are evaluated, an
%           m-by-1 numeric vector.
%
%       ct: Multivariate t distribution component centers, an (m-1)-by-1
%           numeric vector.  Elements correspond to the first m-1 elements
%           of params.
%
%       st: Multivariate t distribution component scales, an (m-1)-by-1
%           numeric (m-1)-by-1 numeric vector.  Elements correspond to the
%           first m-1 elements of params.
%
%      dof: Degrees of freedom for the multivariate t distribution, a
%           numeric scalar or (m-1)-by-1 numeric vector. priorMVTIG expands
%           scalars such that dof = dof*ones(m-1,1). Elements of dof
%           correspond to the elements of params(1:end-1).
%
%        C: correlation matrix for the multivariate t distribution, an
%           (m-1)-by-(m-1) symmetric, positive definite matrix. Rows and
%           columns correspond to the elements of params(1:end-1).
% 
%        a: Inverse gamma shape parameter, a positive numeric scalar.
%
%        b: Inverse gamma scale parameter, a positive scalar.
%
beta = params(1:(end-1));
sigma2 = params(end);

tVal = (beta - ct)./st;
mvtDensity = mvtpdf(tVal,C,dof);
igDensity = sigma2^(-a-1)*exp(-1/(sigma2*b))/(gamma(a)*b^a);

logPDF = log(mvtDensity*igDensity);
end


Create an anonymous function that operates like priorMVTIG, but accepts the parameter values only and holds the hyperparameter values fixed to arbitrarily chosen values.

rng(1); % For reproducibility
dof = 30;
V = rand(4,4);
Sigma = V'*V;
st = sqrt(diag(Sigma));
C = Sigma./(st*st');
ct = -10*rand(4,1);
a = 10*rand;
b = 10*rand;
logPDF = @(params)priorMVTIG(params,ct,st,dof,C,a,b);

Create a custom joint prior model for the linear regression parameters. Specify the number of predictors, p. Also, specify the function handle for priorMVTIG and the variable names.

p = 3;
PriorMdl = bayeslm(p,'ModelType','custom','LogPDF',logPDF,...
    'VarNames',["IPI" "E" "WR"]);

PriorMdl is a customblm Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance.

Load the Nelson-Plosser data set. Create variables for the response and predictor series.

load Data_NelsonPlosser
X = DataTable{:,PriorMdl.VarNames(2:end)};
y = DataTable{:,'GNPR'};

Estimate the marginal posterior distributions of and using the Hamiltonian Monte Carlo (HMC) sampler. Specify drawing 10000 samples and a burn-in period of 1000 draws.

PosteriorMdl = estimate(PriorMdl,X,y,'Sampler','hmc','NumDraws',1e4,...
    'Burnin',1e3);
Method: MCMC sampling with 10000 draws
Number of observations: 62
Number of predictors:   4
 
           |   Mean      Std          CI95         Positive  Distribution 
--------------------------------------------------------------------------
 Intercept | -23.8627  3.1511  [-30.065, -17.459]    0.000     Empirical  
 IPI       |   4.3810  0.1088   [ 4.166,  4.594]     1.000     Empirical  
 E         |   0.0011  0.0002   [ 0.001,  0.002]     1.000     Empirical  
 WR        |   2.5026  0.3360   [ 1.846,  3.184]     1.000     Empirical  
 Sigma2    |  40.0057  6.8285   [28.631, 55.557]     1.000     Empirical  
 

PosteriorMdl is an empiricalblm model object storing the draws from the posterior distributions.

View a trace plot and ACF plot of the draws from the posterior of, for example, and the disturbance variance. Do not plot the burn-in period.

figure;
subplot(2,1,1)
plot(PosteriorMdl.BetaDraws(2,1001:end));
title('Trace plot - \beta_1');
xlabel('MCMC draw')
ylabel('Simulation index')
subplot(2,1,2)
autocorr(PosteriorMdl.BetaDraws(2,1001:end))

figure;
subplot(2,1,1)
plot(PosteriorMdl.Sigma2Draws(1001:end));
title('Trace plot - Disturbance Variance');
xlabel('MCMC draw')
ylabel('Simulation index')
subplot(2,1,2)
autocorr(PosteriorMdl.Sigma2Draws(1001:end))

The MCMC sample of the disturbance variance appears to be mixing well.

This example uses the same data and context as Estimate Posterior Using Hamiltonian Monte Carlo Sampler, but assumes a diffuse prior model instead.

Create a diffuse prior model for the linear regression parameters. Specify the number of predictors, p, and the names of the regression coefficients.

p = 3;
PriorMdl = bayeslm(p,'ModelType','diffuse','VarNames',["IPI" "E" "WR"])
PriorMdl = 

 
           | Mean  Std        CI95        Positive        Distribution       
-----------------------------------------------------------------------------
 Intercept |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 IPI       |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 E         |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 WR        |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 Sigma2    |  Inf  Inf  [   NaN,    NaN]    1.000   Proportional to 1/Sigma2 
 

PriorMdl is a diffuseblm model object.

Load the Nelson-Plosser data set. Create variables for the response and predictor series.

load Data_NelsonPlosser
X = DataTable{:,PriorMdl.VarNames(2:end)};
y = DataTable{:,'GNPR'};

Estimate the conditional posterior distributions of given that and the data.

[Mdl,condPostMeanBeta,CondPostCovBeta] = estimate(PriorMdl,X,y,...
    'Sigma2',2);
Method: Analytic posterior distributions
Conditional variable: Sigma2 fixed at   2
Number of observations: 62
Number of predictors:   4
 
           |   Mean      Std          CI95         Positive     Distribution    
--------------------------------------------------------------------------------
 Intercept | -24.2536  1.8696  [-27.918, -20.589]    0.000   N (-24.25, 1.87^2) 
 IPI       |   4.3913  0.0301   [ 4.332,  4.450]     1.000   N (4.39, 0.03^2)   
 E         |   0.0011  0.0001   [ 0.001,  0.001]     1.000   N (0.00, 0.00^2)   
 WR        |   2.4682  0.0743   [ 2.323,  2.614]     1.000   N (2.47, 0.07^2)   
 Sigma2    |    2       0       [ 2.000,  2.000]     1.000   Fixed value        
 

estimate returns the 4-by-1 vector of means and the 4-by-4 covariance matrix of the conditional posterior distribution of given and the data in condPostMeanBeta and CondPostCovBeta, respectively. Also, estimate displays a summary of the conditional posterior distribution of . Because it was fixed during estimation, inferences on are trivial.

Display Mdl.

Mdl
Mdl = 

 
           | Mean  Std        CI95        Positive        Distribution       
-----------------------------------------------------------------------------
 Intercept |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 IPI       |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 E         |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 WR        |  0    Inf  [   NaN,    NaN]    0.500   Proportional to one      
 Sigma2    |  Inf  Inf  [   NaN,    NaN]    1.000   Proportional to 1/Sigma2 
 

Because estimate computes the conditional posterior distribution, it returns the original prior model, not the posterior, in the first position of the output argument list.

Estimate the conditional posterior distributions of given that is condPostMeanBeta.

[~,~,~,condPostMeanSigma2,condPostVarSigma2] = estimate(PriorMdl,X,y,...
    'Beta',condPostMeanBeta);
Method: Analytic posterior distributions
Conditional variable: Beta fixed at -24.2536       4.3913   0.00112035      2.46823
Number of observations: 62
Number of predictors:   4
 
           |   Mean      Std          CI95         Positive     Distribution    
--------------------------------------------------------------------------------
 Intercept | -24.2536   0      [-24.254, -24.254]    0.000   Fixed value        
 IPI       |   4.3913   0       [ 4.391,  4.391]     1.000   Fixed value        
 E         |   0.0011   0       [ 0.001,  0.001]     1.000   Fixed value        
 WR        |   2.4682   0       [ 2.468,  2.468]     1.000   Fixed value        
 Sigma2    |  48.5138  9.0088   [33.984, 69.098]     1.000   IG(31.00, 0.00069) 
 

estimate returns the mean and variance of the conditional posterior distribution of given is condPostMeanBeta and the data in condPostMeanSigma2 and CondPostVarSigma2, respectively. In the display, inferences on are trivial.

This example uses the same data and context as Estimate Posterior Using Hamiltonian Monte Carlo Sampler, but assumes a semiconjugate prior model instead.

Create a semiconjugate prior model for the linear regression parameters. Specify the number of predictors, p, and the names of the regression coefficients.

p = 3;
PriorMdl = bayeslm(p,'ModelType','semiconjugate',...
    'VarNames',["IPI" "E" "WR"]);

PriorMdl is a semiconjugateblm model object.

Load the Nelson-Plosser data set. Create variables for the response and predictor series.

load Data_NelsonPlosser
X = DataTable{:,PriorMdl.VarNames(2:end)};
y = DataTable{:,'GNPR'};

Estimate the marginal posterior distributions of and . Return the estimation summary table.

rng(1); % For reproducibility
[PosteriorMdl,~,~,~,~,Summary] = estimate(PriorMdl,X,y);
Method: Gibbs sampling with 10000 draws
Number of observations: 62
Number of predictors:   4
 
           |   Mean      Std          CI95        Positive  Distribution 
-------------------------------------------------------------------------
 Intercept | -23.9922  9.0520  [-41.734, -6.198]    0.005     Empirical  
 IPI       |   4.3929  0.1458   [ 4.101,  4.678]    1.000     Empirical  
 E         |   0.0011  0.0003   [ 0.000,  0.002]    0.999     Empirical  
 WR        |   2.4711  0.3576   [ 1.762,  3.178]    1.000     Empirical  
 Sigma2    |  46.7474  8.4550   [33.099, 66.126]    1.000     Empirical  
 

PosteriorMdl is an empiricalblm model object because marginal posterior distributions semiconjugate models are analytically intractable, so estimate must implement a Gibbs sampler. Summary is a table containing the estimates and inferences that estimate displays at the command line.

Display the summary table.

Summary
Summary =

  5x5 table

                   Mean          Std                  CI95              Positive    Distribution
                 _________    __________    ________________________    ________    ____________

    Intercept      -23.992         9.052       -41.734       -6.1976    0.0053      'Empirical' 
    IPI             4.3929       0.14578        4.1011        4.6782         1      'Empirical' 
    E            0.0011124    0.00033976    0.00045128     0.0017883    0.9989      'Empirical' 
    WR              2.4711        0.3576        1.7622        3.1781         1      'Empirical' 
    Sigma2          46.747         8.455        33.099        66.126         1      'Empirical' 

Access the 95% equi-tailed credible interval of the regression coefficient of IPI.

Summary.CI95(2,:)
ans =

    4.1011    4.6782

Input Arguments

collapse all

Bayesian linear regression model usually representing a prior model, specified as an object in this table.

ValueBayesian Linear Regression Model Description
conjugateblm model objectDependent, normal-inverse-gamma conjugate model returned by bayeslm or estimate
semiconjugateblm model objectIndependent, normal-inverse-gamma semiconjugate model returned by bayeslm
diffuseblm model objectDiffuse prior model returned by bayeslm
empiricalblm model objectSamples from prior distributions characterize prior model returned by bayeslm or estimate
customblm model objectPrior distribution function that you declare returned by bayeslm

PriorMdl can also represent a joint posterior model returned by estimate, that is, either a conjugateblm or empiricalblm model object. In this case, estimate “updates” the joint posterior distribution using the new observations in X and y.

Predictor data for the multiple linear regression model, specified as a numObservations-by-PriorMdl.NumPredictors numeric matrix. numObservations is the number of observations, and must be equal to the length of y.

Data Types: double

Response data for the multiple linear regression model, specified as a numeric vector with numObservations elements.

Data Types: double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'Sigma2',2 specifies estimating the conditional posterior distribution of the regression coefficients given the data and that the specified disturbance variance is 2.

Options for All Prior Distributions

collapse all

Flag to display Bayesian-estimator summary to the command line, specified as the comma-separated pair consisting of 'Display' and a value in this table.

ValueDescription
trueestimate prints estimation information and a table to the command line summarizing the Bayesian estimators.
falseestimate does not print to the command line.

The estimation information includes the estimation method, any fixed parameters, number of observations, and number of predictors. The summary table contains estimated posterior means, standard errors (square root of the posterior variance), 95% equal-tailed credible intervals, the posterior probability that the parameter is greater than 0, and, if known, a description of the posterior distribution.

If you specify one of Beta or Sigma2, then estimate dispatches your specification to the display, and corresponding posterior estimates are trivial.

Example: 'Display',false

Data Types: logical

Options for All Prior Distributions Except Empirical

collapse all

Value of the regression coefficients for conditional posterior distribution estimation of the disturbance variance, specified as the comma-separated pair consisting of 'Beta' and a (PriorMdl.Intercept + PriorMdl.NumPredictors)-by-1 numeric vector. That is, estimate estimates the characteristics of π(σ2|y,X,β = Beta), where y is y, X is X, and Beta is the value of 'Beta'. If PriorMdl.Intercept is true, then Beta(1) corresponds to the model intercept. All other values correspond to the predictor variables composing the columns of X.

You cannot specify Beta and Sigma2 simultaneously.

By default, estimate does not compute characteristics of the conditional posterior of σ2.

Example: 'Beta',1:3

Data Types: double

Value of the disturbance variance for conditional posterior distribution estimation of the regression coefficients, specified as the comma-separated pair consisting of 'Sigma2' and a positive numeric scalar. That is, estimate estimates characteristics of π(β|y,X,Sigma2), where y is y, X is X, and Sigma2 is the value of 'Sigma2'.

You cannot specify Beta and Sigma2 simultaneously.

By default, estimate does not compute characteristics of the conditional posterior of β.

Example: 'Sigma2',1

Data Types: double

Options for Semiconjugate, Empirical, and Custom Prior Distributions

collapse all

Monte Carlo simulation adjusted sample size, specified as the comma-separated pair consisting of 'NumDraws' and a positive integer. estimate actually draws BurnInNumDraws*Thin samples. Hence, estimate bases the estimates off of NumDraws samples. For details on how estimate reduces the full Monte Carlo sample, see Algorithms.

If PriorMdl is a semiconjugateblm model and you specify Beta or Sigma2, then MATLAB® ignores NumDraws.

Example: 'NumDraws',1e7

Data Types: double

Options for Semiconjugate and Custom Prior Distributions

collapse all

Number of draws to remove from beginning of Monte Carlo sample to reduce transient effects, specified as the comma-separated pair consisting of 'BurnIn' and nonnegative scalar. For details on how estimate reduces the full Monte Carlo sample, see Algorithms.

Tip

To help you specify the appropriate burn-in period size, determine the extent of the transient behavior in the Monte Carlo sample by specifying 'BurnIn',0, simulating a few thousand observations using simulate, and then plotting the paths.

Example: 'BurnIn',0

Data Types: double

Monte Carlo adjusted sample size multipler, specified as the comma-separated pair consisting of 'Thin' and a positive integer.

The actual Monte Carlo sample size is BurnIn + NumDraws*Thin. After discarding the burn-in, estimate discards every Thin1 draws, and then retains the next. For details on how estimate reduces the full Monte Carlo sample, see Algorithms.

Tip

To reduce potential large serial correlation in the Monte Carlo sample or to reduce the memory consumption of the draws stored in PosteriorMdl, specify a large value of Thin.

Example: 'Thin',5

Data Types: double

Starting values of the regression coefficients for the Markov Chain Monte Carlo (MCMC) sample, specified as the comma-separated pair consisting of 'BetaStart' and a numeric column vector with (PriorMdl.Intercept + PriorMdl.NumPredictors) elements. By default, BetaStart is the ordinary least squares estimate.

Tip

It is good practice to run estimate many times using different parameter starting values. Verify that the solutions from each run converge to similar values.

Example: 'BetaStart',[1; 2; 3]

Data Types: double

Starting values of the disturbance variance for the MCMC sample, specified as the comma-separated pair consisting of 'Sigma2Start' and a positive numeric scalar. By default, Sigma2Start is the residual mean squared error of the OLS estimator.

Tip

It is good practice to run estimate many times using different parameter starting values. Verify that the solutions from each run converge to similar values.

Example: 'Sigma2Start',4

Data Types: double

Options for Custom Prior Distributions

collapse all

Reparameterize σ2 as log(σ2) during posterior estimation and simulation, specified as the comma-separated pair consisting of 'Reparameterize' and a value in this table.

ValueDescription
falseestimate does not reparameterize σ2.
trueestimate reparameterizes σ2 as log(σ2). estimate converts results back to the original scale, and does not change the functional form of PriorMdl.LogPDF.

Tip

If you experience numerical instabilities during the posterior estimation or simulation of σ2, then specify 'Reparameterize',true.

Example: 'Reparameterize',true

Data Types: logical

MCMC sampler, specified as the comma-separated pair consisting of 'Sampler' and a value in this table.

ValueDescription
'slice'Slice sampler
'metropolis'Random walk Metropolis sampler
'hmc'Hamiltonian Monte Carlo (HMC) sampler

Tip

  • To increase the quality of the MCMC draws, tune the sampler.

    1. Before calling estimate, specify the tuning parameters and their values by using sampleroptions. For example, to specify the slice sampler width width, use

      options = sampleroptions('Sampler',"slice",'Width',width);

    2. Specify the object containing the tuning-parameter specifications returned by sampleroptions by using the 'Options' name-value pair argument. For example, to use the tuning-parameter specifications in options, specify

      'Options',options

  • If you specify the HMC sampler, then a best practice is to provide the gradient for some variables, at least. estimate resorts the numerical computation of any missing partial derivatives (NaN values) in the gradient vector.

Example: 'Sampler',"hmc"

Data Types: string

Sampler options, specified as the comma-separated pair consisting of 'Options' and a structure array returned by sampleroptions. Use 'Options' to specify the MCMC sampler and its tuning parameter values.

Example: 'Options',sampleroptions('Sampler',"hmc")

Data Types: struct

Typical sampling-interval width around the current value in the marginal distributions for the slice sampler, specified as the comma-separated pair consisting of 'Width' and a positive numeric scalar or a (PriorMdl.Intercept + PriorMdl.NumPredictors + 1)-by-1 numeric vector of positive values. The first element corresponds to the model intercept, if one exists in the model. The next PriorMdl.NumPredictors elements correspond to the coefficients of the predictor variables ordered by the predictor data columns. The last element corresponds to the model variance.

  • If Width is a scalar, then estimate applies Width to all PriorMdl.NumPredictors + PriorMdl.Intercept + 1 marginal distributions.

  • If Width is a numeric vector, then estimate applies the first element to the intercept (if one exists), the next PriorMdl.NumPredictors elements to the regression coefficients corresponding to the predictor variables in X, and the last element to the disturbance variance.

  • If the sample size (size(X,1)) is less than 100, then Width is 10 by default.

  • If the sample size is at least 100, then, by default, estimate sets Width to the vector of corresponding posterior standard deviations, assuming a diffuse prior model (diffuseblm).

estimate dispatches Width to the slicesample function. For more details, see slicesample.

Tip

  • For maximum flexibility, specify the slice sampler width width by using the 'Options' name-value pair argument. For example:

    'Options',sampleroptions('Sampler',"slice",'Width',width)

  • The typical width of the slice sampler does not affect convergence of the MCMC sample. However, it does affect the number of required function evaluations, that is, the efficiency of the algorithm. If the width is too small, then the algorithm can implement an excessive number of function evaluations to determine the appropriate sampling width. If the width is too large, then the algorithm might have to decrease the width to an appropriate size, which requires function evaluations.

Example: 'Width',[100*ones(3,1);10]

Output Arguments

collapse all

Bayesian linear regression model storing distribution characteristics, returned as a conjugateblm, semiconjugateblm, diffuseblm, empiricalblm, or a customblm model object.

  • If you do not specify either of Beta or Sigma2 (that is, their values are []), then estimate updates the prior model using the data likelihood to form the posterior distribution. PosteriorMdl characterizes the posterior distribution, and its object type depends on the prior model type (PriorMdl).

    ValuePriorMdl
    conjugateblm model objectconjugateblm or diffuseblm model objects
    empiricalblm model objectsemiconjugateblm, empiricalblm, or customblm model objects

  • If you specify one of Beta or Sigma2, then PosteriorMdl equals PriorMdl (that is, they are the same object storing the same property values). In other words, estimate does not update the prior model to form the posterior model. However, estBeta, EstBetaCov, estSigma2, estSigma2Var, and Summary store conditional posterior estimates.

For details on supported posterior distributions that are analytically tractable, see Analytically Tractable Posteriors.

Estimated posterior mean of the regression coefficients, returned as a numeric column vector of size PriorMdl.Intercept + PriorMdl.NumPredictors.

If PriorMdl.Intercept is true, then estBeta(1) is the estimated intercept, and all other elements correspond to columns of X. Otherwise, the elements correspond to the columns of X.

If you do not specify Sigma2, then estBeta is the estimated mean of the marginal posterior distribution of β. Otherwise, it is the estimated mean of the conditional posterior distribution of β given Sigma2 and the data.

If you specify Beta, then estBeta equals Beta.

Estimated posterior covariance matrix of the regression coefficients, returned as a (PriorMdl.Intercept + PriorMdl.NumPredictors)-by-(PriorMdl.Intercept + PriorMdl.NumPredictors) numeric matrix. Rows and columns of EstBetaCov are commensurate with the elements of estBeta.

If you do not specify Sigma2, then EstBetaCov is the estimated covariance matrix of the marginal posterior distribution of β. Otherwise, it is the estimated covariance matrix of the conditional posterior distribution of β given σ2 = Sigma2 and the data.

If you specify Beta, then EstBetaCov is a matrix of zeros.

Estimated posterior mean of the disturbance variance, returned as a positive numeric scalar.

If you do not specify Beta, then estSigma2 is the estimated mean of the marginal posterior distribution of σ2. Otherwise, it is the estimated mean of the conditional posterior distribution of σ2 given β = Beta and the data.

If you specify Sigma2, then estSigma2 equals Sigma2.

Estimated posterior variance of the disturbance variance, returned as a numeric scalar.

If you do not specify Beta, then estSigma2Var is the estimated variance of the marginal posterior distribution of σ2. Otherwise, it is the estimated covariance matrix of the conditional posterior distribution of σ2 given β = Beta and the data.

If you specify Sigma2, then estSigma2Var is 0.

Summary of Bayesian estimators, returned as a table. Summary contains the same information as the display of the estimation summary (Display). Rows correspond to parameters and columns correspond to the:

  • Estimated posterior mean (Mean)

  • Standard error (Std)

  • 95% equal-tailed credible interval (CI95)

  • Posterior probability that the parameter is greater than 0 (Positive)

  • Description of the marginal or conditional posterior distribution of the parameter (Distribution)

Row names are the names in PriorMdl.VarNames, and the name of the last row is Sigma2.

Alternatively, pass PosteriorMdl to summarize.

Data Types: table

Limitations

If PriorMdl is an empiricalblm model object, then you cannot specify Beta or Sigma2. That is, you cannot estimate conditional posterior distributions using an empirical prior distribution.

More About

collapse all

Bayesian Linear Regression Model

A Bayesian linear regression model treats the parameters β and σ2 in the multiple linear regression (MLR) model yt = xtβ + εt as random variables.

For times t = 1,...,T:

  • yt is the observed response.

  • xt is a 1-by-(p + 1) row vector of observed values of p predictors. To accommodate a model intercept, x1t = 1 for all t.

  • β is a (p + 1)-by-1 column vector of regression coefficients corresponding to the variables composing the columns of xt.

  • εt is the random disturbance having a mean of zero and Cov(ε) = σ2IT×T, while ε is a T-by-1 vector containing all disturbances. These assumptions imply that the data likelihood is

    (β,σ2|y,x)=t=1Tϕ(yt;xtβ,σ2).

    ϕ(yt;xtβ,σ2) is the Gaussian probability density with mean xtβ and variance σ2 evaluated at yt;.

Before considering the data, a joint prior distribution assumption is imposed on (β,σ2). In a Bayesian analysis, the beliefs about the distribution of the parameters are updated using information about the parameters gleaned from the likelihood of the data. The result is the joint posterior distribution of (β,σ2) or the conditional posterior distributions of the parameters.

Tips

  • Monte Carlo simulation is subject to variation. That is, if estimate uses Monte Carlo simulation, then estimates and inferences might vary when you call estimate multiple times under seemingly equivalent conditions. To reproduce estimation results, set a random number seed using rng before calling estimate.

  • If estimate throws an error while estimating the posterior distribution using a custom prior model, then try adjusting initial parameter values using BetaStart or Sigma2Start, or try adjusting the declared log prior function and then reconstructing the model. The error can indicate that the log of the prior distribution is -Inf at the specified initial values.

Algorithms

  • Whenever the prior distribution (PriorMdl) and the data likelihood yield an analytically tractable posterior distribution, estimate evaluates the closed form solutions to Bayes estimators. Otherwise, estimate resorts to Monte Carlo simulation to estimate parameters and for inference. For more details, see Posterior Estimation and Inference.

  • This figure describes how estimate reduces the Monte Carlo sample using the values of NumDraws, Thin, and BurnIn.

    Rectangles represent successive draws from the distribution. estimate removes the white rectangles from the Monte Carlo sample. The remaining NumDraws black rectangles compose the Monte Carlo sample.

Introduced in R2017a

Was this topic helpful?