Thread Subject: randsample bug?

Subject: randsample bug?

From: MZ

Date: 2 Jul, 2009 18:27:13

Message: 1 of 4

I may have found a bug in randsample. Can anyone suggest a workaround?

I'm calling it with:

randsample(vals,10^5,true,w);

vals is the data set I'm sampling from. w is a weighting function.
Some of the elements of w are equal to zero.

Occasionally I get an error:

"??? Subscript indices must either be real positive integers or
logicals."

And it points to the line of code in randsample.m that reads:

y = population(y);

I checked, and apparently y is equal to 0 for one or two of the 10^5
values in y. Where does y come from?

[dum, y] = histc(rand(k,1),[0 cumsum(p)]);

where p is the normalized weighting array.

That means that histc is returning 0 in y every so often, indicating
that the rand function is out of range. But rand is supposed to return
(0,1), right? So that means that the cumsum(p) part is not going from 0
to 1? In fact, when I ask it if max(cumsum(p))>1, it returns true.
Seems the opposite of what I was expecting.

So, I'm completely confused now.

Subject: randsample bug?

From: Peter Perkins

Date: 2 Jul, 2009 22:15:03

Message: 2 of 4

MZ, I'm not able to reproduce this, but then I don't have the weights that you're using. I'd be ahppy to look into this. Is the population small enough that you could post the weights, as format long g? Or perhaps send them directly to me? The population itself doesn't really matter.

Thanks.

- Peter Perkins
  The MathWorks, Inc.

MZ wrote:
> I may have found a bug in randsample. Can anyone suggest a workaround?
>
> I'm calling it with:
>
> randsample(vals,10^5,true,w);
>
> vals is the data set I'm sampling from. w is a weighting function. Some
> of the elements of w are equal to zero.
>
> Occasionally I get an error:
>
> "??? Subscript indices must either be real positive integers or logicals."
>
> And it points to the line of code in randsample.m that reads:
>
> y = population(y);
>
> I checked, and apparently y is equal to 0 for one or two of the 10^5
> values in y. Where does y come from?
>
> [dum, y] = histc(rand(k,1),[0 cumsum(p)]);
>
> where p is the normalized weighting array.
>
> That means that histc is returning 0 in y every so often, indicating
> that the rand function is out of range. But rand is supposed to return
> (0,1), right? So that means that the cumsum(p) part is not going from 0
> to 1? In fact, when I ask it if max(cumsum(p))>1, it returns true.
> Seems the opposite of what I was expecting.
>
> So, I'm completely confused now.

Subject: randsample bug?

From: MZ

Date: 3 Jul, 2009 20:57:42

Message: 3 of 4

Peter Perkins wrote:
> MZ, I'm not able to reproduce this, but then I don't have the weights
> that you're using. I'd be ahppy to look into this. Is the population
> small enough that you could post the weights, as format long g? Or
> perhaps send them directly to me? The population itself doesn't really
> matter.
>
> Thanks.
>
> - Peter Perkins
> The MathWorks, Inc.

Peter, my weighting function is fairly long. 2001 elements. I can send
it to you (as a .mat file?) if you give me an email address.

Anyway, I think I identified the problem.

Line 54: p = w(:)' / sum(w);

This normalizes the weighting function and puts it in p. Then, later on
(line 65?), cumsum(p) is used as a histc bin argument. One would think,
then, that max(cumsum(p))==1 should return true... but it doesn't. So
my workaround was to replace that histc line with...

cp = cumsum(p);
cp = cp / max(cp);
[dum, y] = histc(rand(k,1),[0 cp]);

I'm normalizing the cumsum(p) element. It shouldn't NEED normalization
since p was normalized. But for some reason it does. Now it works
every time.

Also, if it helps, I'm using single not double precision for everything.

Subject: randsample bug?

From: Peter Perkins

Date: 6 Jul, 2009 16:04:03

Message: 4 of 4

MZ wrote:

> > Peter, my weighting function is fairly long. 2001 elements. I can send
> > it to you (as a .mat file?) if you give me an email address.

Yes, I would be interested to see that, to confirm my suspicions. A mat file would be best, if possible. My address is as in the msg header, with obvious modification.

> > I'm normalizing the cumsum(p) element. It shouldn't NEED normalization
> > since p was normalized. But for some reason it does. Now it works
> > every time.
> >
> > Also, if it helps, I'm using single not double precision for everything.

Well, that's accumulated round-off error. Single precision makes it a whole lot easier to run into. What I'm confused about is that you said

> > In fact, when I ask it if max(cumsum(p))>1, it returns true.

and that would not seem to cause the problem you saw (perhaps you meant "<"?). In any case, thanks for pointing this out, and thanks in advance for sending those weights.

- Peter Perkins
  The MathWorks, Inc.

Tags for this Thread

Everyone's Tags:

Add a New Tag:

Separated by commas
Ex.: root locus, bode

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Tag Activity for This Thread
Tag Applied By Date/Time
randsample Sprinceana 4 Jul, 2009 08:01:31
rssFeed for this Thread
 

MATLAB Central Terms of Use

NOTICE: Any content you submit to MATLAB Central, including personal information, is not subject to the protections which may be afforded information collected under other sections of The MathWorks, Inc. Web site. You are entirely responsible for all content that you upload, post, e-mail, transmit or otherwise make available via MATLAB Central. The MathWorks does not control the content posted by visitors to MATLAB Central and, does not guarantee the accuracy, integrity, or quality of such content. Under no circumstances will The MathWorks be liable in any way for any content not authored by The MathWorks, or any loss or damage of any kind incurred as a result of the use of any content posted, e-mailed, transmitted or otherwise made available via MATLAB Central. Read the complete Terms prior to use.

Contact us at files@mathworks.com