Thread Subject: bootstrp and bootci don't work if data inputs are not same length

Subject: bootstrp and bootci don't work if data inputs are not same length

From: Alison

Date: 7 Nov, 2009 03:35:05

Message: 1 of 3

Has anyone made a workaround for this issue? bootstrp documentation does not constrain the input data to same size, but this error is thrown when passing in two vectors of different sizes (which should be fine, so long as the bootfun is designed to handle this):

??? Index exceeds matrix dimensions.

Error in ==> bootstrp at 114
      tmp = feval(bootfun,X1(onesample,:),X2(onesample,:));


Looking at bootstrp line 114, the cause is clear:

onesample (a set of n indices randomly generated over [1,n]) is generated once per bootstrap sample, and then used to index into all data vectors passed into bootstrp (ugh). However, n is defined as the max length of all varargin, so indexing into the shorter data vector(s) will throw the error.
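To make the failure concrete, here is a minimal sketch that mimics the indexing step without calling bootstrp itself. The data vectors and the fixed index vector are made up for illustration (the real function draws onesample at random); only the variable names onesample and n come from the error message and description above.

```matlab
% Hypothetical data: two 0/1 vectors of different lengths.
x1 = [1 0 1 1 0 1 0 1]';   % 8 observations
x2 = [0 1 1 0 1]';         % 5 observations

% n is the longer of the two lengths, as described above.
n = max(numel(x1), numel(x2));

% Deterministic stand-in for one random draw of n indices from 1:n.
% It contains the index 8, as most random draws would.
onesample = (n:-1:1)';

a = x1(onesample, :);      % fine: every index is <= 8

failed = false;
try
    b = x2(onesample, :);  % fails: index 8 exceeds the 5 rows of x2
catch err
    failed = true;
    disp(err.message);     % the exact wording depends on the release
end
```

Any draw that contains an index larger than the length of the shorter vector triggers the same error.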

In case anyone is interested, the bootfun takes two vectors (each of whose elements contain only 1's and 0's), and computes a ratio of (sum1/length1) / (sum2/length2).

Thank you for the help!

Subject: bootstrp and bootci don't work if data inputs are not same length

From: Peter Perkins

Date: 11 Nov, 2009 14:52:34

Message: 2 of 3

Alison wrote:
> Has anyone made a workaround for this issue? bootstrp documentation does not constrain the input data to same size

Sure it does:

>> help bootstrp
 BOOTSTRP Bootstrap statistics.
    BOOTSTAT = BOOTSTRP(NBOOT,BOOTFUN,D1,...) draws NBOOT bootstrap data
[snip]
    The third and later input arguments (D1,...) are data (scalars,
    column vectors, or matrices) that are used to create inputs to BOOTFUN.
    BOOTSTRP creates each bootstrap sample by sampling with replacement
    from the rows of the non-scalar data arguments (these must have the
    same number of rows). Scalar data are passed to BOOTFUN unchanged.


>, but this error is thrown when passing in two vectors of different sizes (which should be fine, so long as the bootfun is designed to handle this):

BOOTSTRP is designed to support the following kind of bootstrapping: The "data arguments" are thought of as a single set of data, comprising n observations on p variables (i.e., you pass in D1, D2, ... Dp, and they all have n rows). Resample with replacement from the observations, i.e., generate n random integers drawn independently from the set 1:n, and use those to index into D1, D2, ... Dp, "in parallel".
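The "in parallel" resampling described above can be sketched by hand as follows. This is only an illustration of the idea, not the actual BOOTSTRP source; the data, the statistic in bootfun, and nboot are made-up assumptions.

```matlab
% Hypothetical data set: n = 6 observations on p = 2 variables.
D1 = [2.1 3.4 1.9 4.0 2.8 3.3]';
D2 = [10 14 9 17 12 13]';
n = size(D1, 1);
nboot = 200;

% Example statistic (an assumption): ratio of the two sample means.
bootfun = @(a, b) mean(b) / mean(a);

bootstat = zeros(nboot, 1);
for k = 1:nboot
    % One set of n row indices, drawn with replacement from 1:n ...
    onesample = randi(n, n, 1);
    % ... and used for every data argument, so rows stay paired.
    bootstat(k) = bootfun(D1(onesample, :), D2(onesample, :));
end
```

Because the same onesample indexes both D1 and D2, observation i of D1 always travels with observation i of D2, which is why all non-scalar data arguments need the same number of rows.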

BOOTSTRP accepting scalar arguments is just a convenience, to allow "tuning parameters" to be passed into your bootstrap function. It actually pre-dates anonymous functions in MATLAB, which is what we encourage people to use these days.
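As a rough illustration of the two styles, the sketch below passes a tuning parameter first as a scalar data argument and then by capturing it in an anonymous function. It assumes the Statistics Toolbox is installed; the data, the threshold thresh, and the statistic are invented for the example.

```matlab
% Hypothetical data and tuning parameter.
x = [0.2 0.9 0.4 0.7 0.1 0.8 0.6 0.3 0.95 0.5]';
thresh = 0.5;
nboot = 500;

% Older style: the scalar is passed as a data argument and reaches
% the bootstrap function unchanged.
stat1 = bootstrp(nboot, @(d, t) mean(d > t), x, thresh);

% Anonymous-function style: thresh is captured when the function
% handle is created, so only the real data is passed to bootstrp.
fracAbove = @(d) mean(d > thresh);
stat2 = bootstrp(nboot, fracAbove, x);
```

Both calls estimate the same quantity; the second keeps the data arguments limited to things that are actually resampled.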

> In case anyone is interested, the bootfun takes two vectors (each of whose elements contain only 1's and 0's), and computes a ratio of (sum1/length1) / (sum2/length2).

What you've described sounds equivalent to a stratified bootstrap, which is more general than what BOOTSTRP does. It should be straightforward for you to use BOOTSTRP as a starting point for your own function that allows that generality. What you've described would presumably require _two_ resampling operations in parallel, not the one that BOOTSTRP is prepared to do.
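A minimal sketch of the two resampling operations, written as a plain loop rather than as a modified BOOTSTRP: each vector is resampled with its own index set, so the lengths no longer need to match. The data and nboot are invented; ratiofun follows the (sum1/length1) / (sum2/length2) statistic quoted above.

```matlab
% Hypothetical 0/1 data of different lengths.
x1 = [1 0 1 1 0 1 0 1 1 1]';   % 10 observations
x2 = [1 1 1 0 1 0 1 1]';       % 8 observations
n1 = numel(x1);
n2 = numel(x2);
nboot = 1000;

% Ratio of the two proportions, as described in the original post.
ratiofun = @(a, b) (sum(a) / numel(a)) / (sum(b) / numel(b));

bootstat = zeros(nboot, 1);
for k = 1:nboot
    i1 = randi(n1, n1, 1);     % indices for x1 only, drawn from 1:n1
    i2 = randi(n2, n2, 1);     % independent indices for x2, from 1:n2
    % Note: a resample of x2 with no ones gives Inf (or NaN if the
    % x1 resample has no ones either); real code should handle that.
    bootstat(k) = ratiofun(x1(i1), x2(i2));
end
```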

Hope this helps.

Subject: bootstrp and bootci don't work if data inputs are not same length

From: Alison

Date: 11 Nov, 2009 18:05:22

Message: 3 of 3

Thank you, Peter. I did notice the same-length constraint afterwards. Your suggestion to run two bootstraps in parallel is helpful. I ended up writing my own bootstrapping code to create the bootstat as a ratio of results from two vectors of different lengths, and then dovetailed the sub-functions for the bootci.
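For readers who want a starting point, here is one way such code might look. It is not the code referred to above: it is a sketch that assumes a simple percentile interval, which is cruder than the interval types bootci offers. The data, nboot, and alpha are made up.

```matlab
% Hypothetical 0/1 data of different lengths.
x1 = [1 1 0 1 1 1 0 1 1 1 0 1]';   % 12 observations
x2 = [1 0 1 1 0 1 1 1]';           % 8 observations
n1 = numel(x1);
n2 = numel(x2);
nboot = 2000;
alpha = 0.05;                      % for a 95% interval

% Draw all index sets at once: one column per bootstrap replicate.
idx1 = randi(n1, n1, nboot);
idx2 = randi(n2, n2, nboot);

% Column means are the resampled proportions of ones.
p1 = mean(x1(idx1), 1);
p2 = mean(x2(idx2), 1);
bootstat = (p1 ./ p2)';            % nboot-by-1 ratios

% Percentile interval from the sorted bootstrap statistics.
s = sort(bootstat);
lowerIdx = max(1, floor(nboot * alpha / 2));
upperIdx = ceil(nboot * (1 - alpha / 2));
ci = [s(lowerIdx); s(upperIdx)];
```

Drawing the index matrices in one call avoids the explicit loop; each column of idx1 and idx2 plays the role of one independent resample.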

