statistics-resampling

A statistics toolbox with a variety of bootstrap and other resampling tools. (This is the most recent, developmental version of the toolbox.)
24 Downloads
Updated 25 May 2024

Read me

Documentation

Package maintainer

Andrew Penn (andy.c.penn@gmail.com)

Package contributors

Andrew Penn
(More contributors are welcome!)

Citations

If you use this package, please include the following citation(s):

(Note that package versions 5.4.3 and below were named the 'statistics-bootstrap' package. The 'statistics-resampling' package is a more developed version of the older 'iboot' package).

Description

The statistics-resampling package is an Octave package and Matlab toolbox that can be used to perform a wide variety of statistics tasks using non-parametric resampling methods. In particular, the included functions can be used to estimate bias, uncertainty (standard errors and confidence intervals) and prediction error, and to calculate p-values for null hypothesis significance tests. The package also includes variations of the resampling methods that improve the accuracy of the statistics for small samples and for samples with complex dependence structures.
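To make the idea of resampling concrete, here is a minimal sketch of a nonparametric bootstrap of the mean written in plain Octave/Matlab. It deliberately does not call any function from the package, so it says nothing about how the package implements its methods; the data values, the number of resamples and all variable names are made up for illustration, and the percentile interval shown is only the simplest kind of bootstrap interval.

```octave
% Plain Octave/Matlab sketch of a nonparametric bootstrap of the mean.
% No statistics-resampling functions are used here (illustration only).
x = [4.1; 5.3; 2.8; 6.0; 4.7; 5.9; 3.6; 5.2];  % made-up data (column vector)
n = numel (x);
nboot = 2000;                    % number of bootstrap resamples (arbitrary)

% Each column of idx holds n indices drawn with replacement from 1..n
idx = randi (n, n, nboot);

% Evaluate the statistic (here, the mean) on every resample: 1 x nboot
bootstat = mean (x(idx), 1);

theta = mean (x);                % estimate from the original sample
bias = mean (bootstat) - theta;  % bootstrap estimate of bias
se = std (bootstat);             % bootstrap standard error

% Simple 95% percentile interval from the sorted bootstrap statistics
sorted = sort (bootstat);
ci = sorted([round(0.025 * nboot), round(0.975 * nboot)]);

fprintf ('mean = %.3f, bias = %.3f, SE = %.3f\n', theta, bias, se);
fprintf ('95%% percentile CI: [%.3f, %.3f]\n', ci(1), ci(2));
```

The random number generator is not seeded, so the printed values will differ slightly from run to run.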

Using the statistics-resampling package online

Try out the statistics-resampling package in your browser at statistics-resampling-online: a ready-to-go implementation of statistics-resampling in a JupyterLab Notebook with an Octave kernel. The first time you use statistics-resampling online with Binder after a repository commit, expect to wait a while for a Docker image to build; subsequent access to statistics-resampling-online should take about a minute or less.

Collaborative student projects in GNU Octave can use the statistics-resampling package at Octave-Online. Doing so requires users to download the latest release of the Source code (tar.gz) from here and follow steps 2-5 of these instructions.

Users who prefer Jupyter and have a workflow that is collaborative and/or crosses over multiple programming languages may find it more convenient to install and use the statistics-resampling package at COCALC. The approach described above (for Octave-Online) also applies to installing the statistics-resampling package via a Jupyter Notebook with an Octave kernel at COCALC.

Alternatively, if you have an account with MATLAB, you can try out the statistics-resampling package at Matlab-Online, either by following the local installation instructions below or by adding the toolbox (not collection) version of the package via Apps >> Get More Apps.

Follow the links in the 'Quick start' section below to obtain some examples of data and code to try out with the package.

Requirements and dependencies

Users with greater computational demands may want to consider installing and running the statistics-resampling package offline. Installation of the statistics-resampling package has some software requirements. The core functions in this package require, and are known to be compatible with, GNU Octave (version >= 4.4.0) and Matlab (version >= R2007a 7.4.0). Some optional features of this package have further dependencies:

Installation

To install (or test) the statistics-resampling package on your computer at a location of your choice, for either Matlab or Octave, follow these steps:

  • Download the latest package release from here. Extract (not just browse) the contents of the compressed file (.zip or .tar.gz), and move the package directory to the desired location.
  • Open Octave or Matlab (command prompt).
  • Change directory (cd) into the package folder. (The directory contains files called 'make.m' and 'install.m', among others.)
  • Type make to compile the MEX files from source (or to use the precompiled binaries, if available). If suitable precompiled binaries are not available for your platform, Matlab/Octave will need access to a C++11 compiler. If you skip the make step, the package functions will still work, but some will run slower. This step is interactive, so check the command window.
  • Type install. The package will load now (and automatically in the future) when you start Octave/Matlab.

If you want or need to uninstall the package, change directory (cd) into the package folder and type uninstall.
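Typed at the Octave or Matlab command prompt, the local installation steps above might look like the following sketch. The folder path is a hypothetical placeholder; make, install and uninstall are the scripts named in the steps above.

```octave
% Run at the Octave or Matlab command prompt.
% '/path/to/statistics-resampling' is a placeholder: use the folder
% where you extracted the package (it contains make.m and install.m).
cd ('/path/to/statistics-resampling')

make       % interactive: compile the MEX files or use precompiled binaries
install    % the package loads now and automatically in future sessions

% To remove the package later, cd into the same folder and type:
% uninstall
```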

Alternatively, Octave users can install the latest release of the package just like any other Octave package by typing:

pkg install -forge statistics-resampling

Or for the most recent developmental version of the package:

pkg install "https://github.com/gnu-octave/statistics-resampling/archive/refs/heads/master.zip"

The package can then be loaded on demand in Octave with the following command:

pkg load statistics-resampling

(Note that this isn't necessary if you used the local installation instructions first described in this section)

Alternatively, MATLAB users can conveniently install the package functions as a toolbox by double-clicking the 'statistics-resampling.mltbx' file in the matlab subdirectory. The toolbox installed in this way can be disabled or uninstalled via MATLAB's Add-On manager. MEX files are included with the toolbox installation for Windows (32- or 64-bit), MacOS (Intel or Apple Silicon 64-bit) and Linux (64-bit).

Usage

All help and demos are documented on the 'Function Reference' page in the manual. If you do not see the navigation pane on the manual web pages, please enable javascript in your browser. If you need further help with using any of the functions in this package, please post your questions on the discussions page.

Function help can also be requested directly from the Octave/MATLAB command prompt by typing help function-name, substituting the actual function name.

In Octave only, you can get a basic overview of the package and its functions by typing: pkg describe -verbose statistics-resampling, or request demonstrations of function usage by typing demo function-name. Users can also request help with using functions and programming in Octave at the discourse group.
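As a short illustration of the commands described above, the session below uses `bootknife`, one of the package functions mentioned in the release notes, as the example function name. Whether a given function ships with demos is an assumption here; substitute any other function name from the Function Reference.

```octave
% Works in both Octave and MATLAB once the package is installed:
help bootknife

% Octave only: overview of the package and its functions
pkg describe -verbose statistics-resampling

% Octave only: run the demonstrations bundled with a function
% (assumes the chosen function includes demo blocks)
demo bootknife
```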

TIPS: You can now document and publish your statistics-resampling analysis in Jupyter Notebooks (with an Octave kernel) at your GitHub repository using the nbgitpuller link generator and the statistics-resampling-online Binder environment. Alternatively, you could fork the repository or use it as a template for your own GitHub repository. Using Jupyter notebooks, you can also integrate the statistics-resampling package into your analysis workflow alongside other programming languages, including Python, R and Julia.

Quick start

Below are links to demonstrations of how the bootstrap or randomization functions from this package can be used to perform variants of some commonly used statistical tests, but without the Normality assumption:

For examples of how to import data sets from a human-readable text file, such as a tab-separated-value (TSV) or comma-separated-value (CSV) file, see the examples in the JupyterLab Notebook at statistics-resampling-online and the last demonstration listed on this page.
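As a rough sketch of what such an import can look like, the snippet below writes a tiny made-up CSV file to a temporary location and reads it back with `dlmread`, which is available in both Octave and Matlab. The column names and values are invented for illustration, and this is not necessarily the approach used in the linked notebook; for a TSV file, the delimiter argument would be '\t' instead of ','.

```octave
% Write a small, made-up CSV file so the example is self-contained
fname = [tempname() '.csv'];
fid = fopen (fname, 'w');
fprintf (fid, 'group,score\n');                 % header row
fprintf (fid, '%d,%.1f\n', [1 1 2 2; 4.2 5.1 6.3 5.8]);
fclose (fid);

% Read the numeric data, skipping the header (row offset 1, column offset 0)
data = dlmread (fname, ',', 1, 0);
group = data(:,1);     % grouping variable
score = data(:,2);     % measurements

delete (fname);        % tidy up the temporary file
```

The resulting `score` and `group` vectors are the kind of numeric inputs that resampling functions typically operate on.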

Issues

If you find bugs or have any suggestions, please raise an issue on GitHub here. If you have any problems specifically with Binder for statistics-resampling online, please raise an issue on GitHub here. Please make sure that, when reporting a bug, you provide as much information as possible for other users to be able to replicate it.

Package: statistics-resampling

Cite As

Penn, Andrew Charles. Resampling Methods for Small Samples or Samples with Complex Dependence Structures [https://github.com/gnu-octave/statistics-resampling/]. Zenodo, 2020, doi:10.5281/zenodo.3992392.

MATLAB Release Compatibility
Created with R2015b
Compatible with R2007a and later releases
Platform Compatibility
Windows macOS Linux

Version Published Release Notes
5.5.17

Enhancement to the standard error calculations in `bootwild` and the Cohen's d effect sizes in `bootlm` functions when the method is 'wild'.

5.5.16

Fixed bug in `bootstrp` that prevented it from reshaping output (bootstat) correctly when applying vectorized evaluation of bootfun on the data

5.5.15

Fixed syntax error that triggered error when printing output for asymmetric bootstrap-t confidence intervals

5.5.14

Added option to compute bias-corrected confidence intervals with the `bootint` function.

5.5.13

Improved numerical accuracy when solving linear systems in bootlm, bootwild, and bootbayes. Corrected dates that files were modified and updated matlab toolbox and online manual.

5.5.11

Minor update which includes modification to the calculation of influence values (used for BCa intervals) such that they are more consistent with the 'empinf' function in the R 'boot' package

5.5.10

Added check in 'install.m' that issues a warning if 'make.m' was not run first. Minor changes to the `randtest` and `randtest2` settings that decide when to use exact vs approximate permutations. Added extra output argument 'stats' to `bootstrp` function.

5.5.9

Major revamp and improvements to `bootstrp` function. Added new functions `randtest`, `randtest1` and `bootint`. Performance enhancements in `boot`, `bootclust` and `bootknife` functions. Bug fixes in `sampszcalc`. Feature enhancements to `cor`.

5.5.8

Added ability in `bootclust` function to accelerate block (or cluster) bootstrap function evaluations by parallel computing. Added to `bootclust` function a third output argument BOOTDATA - a cell array of data resamples.

5.5.7

Minor changes to documentation and comments.

5.5.6

Added pre-compiled MATLAB mex files for platforms with Apple Silicon processors (boot, smoothmedian), better support for NaN and Inf values (bootlm, bootbayes, bootwild), performance enhancements (bootlm), functionality to calculate standardized para

5.5.5.2

Added support in bootlm function for NaN or Inf (missing values) in categorical predictors. Minor bug fix when executing bootlm with missing values, when the latter causes bootlm to exclude a whole level of a categorical predictor.

5.5.5.0

- Added support for logical (true, false) in GROUP in `bootlm`
- Improved handling of data input (e.g. row vectors and matrices) in `bootlm`
- Added ability to customize posthoc comparisons (with demos added to illustrate too) in `bootlm`

To view or report issues in this GitHub add-on, visit the GitHub Repository.