File Exchange

Bayesian penalized regression with continuous shrinkage prior densities

version 1.6 (75.5 KB) by

Bayesian g-prior, LASSO, horseshoe and horseshoe+ linear and logistic regression

14 Downloads

This is a comprehensive, user-friendly toolbox implementing the state-of-the-art in Bayesian linear regression and Bayesian logistic regression. The toolbox provides highly efficient and numerically stable implementations of ridge, lasso, horseshoe, horseshoe+ and g-prior regression. The lasso, horseshoe and horseshoe+ priors are recommended for data sets where the number of predictors is greater than the sample size. The toolbox allows predictors to be assigned to logical groupings (potentially overlapping, so that predictors can be part of multiple groups). This can be used to exploit a priori knowledge regarding predictors and how they may be related to each other (for example, in grouping genetic data into genes and collections of genes such as pathways).
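The horseshoe prior takes its name from the shape of the shrinkage it implies. The following is a minimal sketch in plain Python, not the toolbox's MATLAB code. It assumes a single coefficient with a local scale drawn from a standard half-Cauchy distribution and with the global scale and noise variance fixed at 1, so the shrinkage weight is kappa = 1 / (1 + lambda^2). The function names are hypothetical.

```python
import math
import random

def half_cauchy_draw(rng):
    """Draw from a standard half-Cauchy C+(0, 1) by inverting the Cauchy CDF."""
    return abs(math.tan(math.pi * (rng.random() - 0.5)))

def shrinkage_weights(n_draws, seed=1):
    """Simulate kappa = 1 / (1 + lambda^2) with lambda ~ C+(0, 1).

    Assumption for illustration: global scale and noise variance equal 1.
    kappa near 0 means almost no shrinkage; kappa near 1 means the
    coefficient is shrunk almost entirely to zero.
    """
    rng = random.Random(seed)
    return [1.0 / (1.0 + half_cauchy_draw(rng) ** 2) for _ in range(n_draws)]

kappa = shrinkage_weights(20000)
near_zero = sum(k < 0.1 for k in kappa) / len(kappa)       # signals left alone
near_one = sum(k > 0.9 for k in kappa) / len(kappa)        # noise shrunk away
middle = sum(0.45 < k < 0.55 for k in kappa) / len(kappa)  # moderate shrinkage
print(near_zero, near_one, middle)
```

Under these assumptions most of the prior mass sits near kappa = 0 and kappa = 1, with little in between. This U shape is why the prior suits sparse problems with more predictors than samples: a coefficient is either left nearly untouched or shrunk heavily.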

To support analysis of data with outliers, we provide two heavy-tailed error models in our implementation of Bayesian linear regression: Laplace and Student-t distribution errors. Most features are straightforward to use and the toolbox can work directly with MATLAB tables (including automatically handling categorical variables), or you can use standard MATLAB matrices.
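To see why heavy-tailed error models help with outliers, it can be useful to compare how strongly each distribution penalises a large residual. The sketch below is plain Python, not the toolbox's code. It assumes unit-scale errors and, for the Student-t model, 5 degrees of freedom; both choices and all function names are illustrative assumptions.

```python
import math

def nll_gaussian(r):
    """Negative log-density of a standard normal at residual r."""
    return 0.5 * r * r + 0.5 * math.log(2.0 * math.pi)

def nll_laplace(r):
    """Negative log-density of a Laplace distribution with scale 1."""
    return abs(r) + math.log(2.0)

def nll_student_t(r, dof=5.0):
    """Negative log-density of a Student-t with `dof` degrees of freedom."""
    log_norm = (math.lgamma((dof + 1.0) / 2.0) - math.lgamma(dof / 2.0)
                - 0.5 * math.log(dof * math.pi))
    return -log_norm + 0.5 * (dof + 1.0) * math.log(1.0 + r * r / dof)

def outlier_penalty(nll, residual):
    """Extra negative log-likelihood paid for a residual, relative to zero."""
    return nll(residual) - nll(0.0)

models = {"gaussian": nll_gaussian, "laplace": nll_laplace,
          "student_t": nll_student_t}
penalties = {name: outlier_penalty(f, 10.0) for name, f in models.items()}
for name, value in penalties.items():
    print(f"{name:10s} {value:8.3f}")
```

The Gaussian penalty grows quadratically in the residual, the Laplace penalty linearly, and the Student-t penalty only logarithmically, so a single extreme observation pulls the fitted coefficients far less under the two heavy-tailed models.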

The toolbox is very efficient and can be used with high-dimensional data. Please see the scripts in the directory "examples\" for examples of how to use the toolbox, or type "help bayesreg" within MATLAB. An R version of this toolbox is now available on CRAN. To install the R package, type install.packages("bayesreg") within R.

To cite this toolbox:
Makalic, E. & Schmidt, D. F.
High-Dimensional Bayesian Regularised Regression with the BayesReg Package
arXiv:1611.06649 [stat.CO], 2016

PLEASE NOTE:
The package now handles logistic regression without the need for MEX files, but compiled code gives large speed-ups and is therefore recommended. To compile the C++ code, run compile.m from the bayesreg directory within MATLAB; compilation requires either MS Visual Studio Professional or the GNU g++ compiler. Alternatively, for convenience, pre-compiled MEX files (MATLAB R2017a) for Windows, Linux and Mac OSX can be downloaded from the following two URLs:

http://www.emakalic.org/blog/
http://dschmidt.org/?page_id=189

To use these, download them and unzip them into the "bayesreg" folder.

Comments and Ratings (7)

Yang Wang

Fantastic job. Thanks guys!

Mike Wong

Steven

Great program. Very useful for my work.

Statovic

Hi Gary,

We have just finished and uploaded Version 1.3 of the software. This has support for MATLAB tables, handles categorical variables appropriately and has a prediction function that can be used to produce predictions, prediction credible intervals and calculate prediction performance statistics. I hope you find it useful.

Cheers,
Daniel

GARY CHAO

Hi Daniel,
Thanks so much for your answer!
I have implemented the toolbox in MATLAB and found the results correspond well with the traditional ridge/lasso regularization results when p<n, meaning the situations when the number of predictors is smaller than the number of observations.
Hope that the new version comes soon and better!
Thanks and best regards,
Gary

Statovic

Hi Gary.

The fully Bayesian approach used in this tool selects the regularisation parameters automatically by including them in the Bayesian hierarchy and sampling them along with the model parameters. The current version implements a "half-Cauchy" prior on the overall regularisation parameter, in accordance with suggestions from Polson and others.
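For readers who want to see what "including the regularisation parameters in the hierarchy" can look like, here is a sketch of a generic global-local shrinkage hierarchy for linear regression. The symbols (local scales \lambda_j, global scale \tau, intercept \beta_0) and the scaling of the coefficient variance by \sigma^2 are assumptions made for illustration and may differ from the toolbox's exact parameterisation.

```latex
\begin{aligned}
y \mid X, \beta, \beta_0, \sigma^2 &\sim \mathcal{N}\!\left(X\beta + \beta_0 1_n,\ \sigma^2 I_n\right) \\
\beta_j \mid \lambda_j, \tau, \sigma^2 &\sim \mathcal{N}\!\left(0,\ \lambda_j^2 \tau^2 \sigma^2\right), \qquad j = 1, \dots, p \\
\lambda_j &\sim \pi(\lambda_j) \\
\tau &\sim \mathcal{C}^{+}(0, 1)
\end{aligned}
```

Here \mathcal{C}^{+}(0, 1) denotes the standard half-Cauchy distribution and \pi(\lambda_j) is set by the chosen prior; for the horseshoe, for example, \lambda_j \sim \mathcal{C}^{+}(0, 1). Because \tau is a random variable in the model, its posterior is sampled together with \beta, so no cross-validation step is needed to tune the overall amount of shrinkage.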

The "best" posterior regression coefficients, in terms of squared-prediction error, are given by retval.muB. We are just finishing a version which provides a "predict" function to compute predictions on new data (or the training data, if you want) and calculates prediction performance statistics. It also allows you to predict using the full Bayesian predictive posterior distribution, accepts MATLAB tables and handles categorical variables. Hopefully this will be released in the next few days.
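The reason a posterior mean is "best" under squared error can be checked numerically. The sketch below is plain Python, not the toolbox's code, and it uses hypothetical skewed draws standing in for the posterior samples of a single coefficient. It compares the average squared loss of the sample mean with that of the sample median.

```python
import random
import statistics

def expected_squared_loss(estimate, samples):
    """Average squared distance between a point estimate and posterior draws."""
    return sum((s - estimate) ** 2 for s in samples) / len(samples)

random.seed(0)
# Hypothetical skewed "posterior" draws for one coefficient (illustration only).
samples = [random.gammavariate(2.0, 1.0) for _ in range(5000)]

post_mean = statistics.fmean(samples)
post_median = statistics.median(samples)

loss_mean = expected_squared_loss(post_mean, samples)
loss_median = expected_squared_loss(post_median, samples)
print(loss_mean, loss_median)
```

The mean of the draws always attains the smallest average squared loss of any point estimate, which is why a posterior-mean summary is the natural choice when predictions are judged by squared error. Other loss functions favour other summaries; absolute error, for instance, favours the median.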

Cheers and thanks for your interest,
Daniel

GARY CHAO

Hi, thanks for your contribution to the Bayesian regularization problem!
I have a question on this toolbox:
With the ridge/lasso regularization methods, we usually choose the coefficient of the regularization term by cross-validation to obtain the most suitable prediction results. Is there a similar process in this toolbox? For example, how can we set the values for the error distributions in this toolbox to obtain different prior distributions? I ask because the results from this toolbox differ from those obtained directly with the ridge/lasso functions in MATLAB.
By the way, how can we obtain the "best" posterior regression coefficients? Can we regard the results in retval.muB as the "best" results?
Thanks in advance for your help!

Updates

1.6

- Display the Watanabe-Akaike (widely applicable) information criterion (WAIC) instead of DIC in summary output
- Implemented block sampling of betas for data with large numbers of predictors (options 'blocksample' and 'blocksize')

1.5

- Wrote a new parallelised C++ implementation of the sampling code for logistic regression
- Included an efficient MATLAB implementation of logistic regression sampling; it works even when MEX files are not available, but is not as fast

1.4

- Added option ‘groups’ which allows grouping of variables into potentially overlapping groups
- Grouping works with HS, HS+ and lasso
- Fixed a bug with g priors and logistic models
- Updated examples to demonstrate grouping and toolbox description

1.3

- Tidied up the summary display
- Added support for MATLAB tables
- Added support for categorical predictors
- Added a prediction function
- Updated and improved the example scripts
- Fixed a bug in the computation of R2

1.2

- This version implements Zellner's g-prior for linear and logistic regression. The g-prior only works with full rank matrices. The examples in "examples_bayesreg.m" have been updated to include a g-prior example.

1.1

- Moved all display code to a separate function called "summary()". Now the summary table can be produced on demand after sampling.
- Updated "examples_bayesreg.m" to include examples of the new "summary()" command.

1.0

Updated description to include links to the full version of the toolbox.

MATLAB Release
MATLAB 9.0 (R2016a)

bayesreg/

bayesreg/examples/