Rank: 5026 based on 20 downloads (last 30 days) and 1 file submitted
photo

Paul

E-mail

Personal Profile:
Professional Interests:

 

Watch this Author's files

 

Files Posted by Paul
Updated   File Tags Downloads
(last 30 days)
Comments Rating
10 Apr 2013 Outlier removal v2 Rosner's many outlier test, vectorized. Author: Paul outlier removal, outlier, rosner, probability, statistics 20 5
  • 3.0
3.0 | 1 rating
Comments and Ratings by Paul View all
Updated File Comments Rating
04 Jul 2012 Outlier removal v2 Rosner's many outlier test, vectorized. Author: Paul

Pasco:

Thanks for the advice! Somehow when I tested it with a bunch of 0s and 1s the -NaN thing seemed to work, dunno why. I've changed it at least temporarily to be -Inf. I'll look into modifying how the mean/std is done at some point, since this is clearly a placeholder.

03 Jul 2012 Outlier removal v2 Rosner's many outlier test, vectorized. Author: Paul

Pasco: Do you have an alternative way to do that? I'd be very interested to know, because I couldn't find anything good that still vectorizes the code.

The key is that I make an upper-triangular matrix there, and I need the mean and standard deviation of the upper triangular half. However, if I leave zeros in the bottom half, they are included in the mean. So the only way I saw to do it is to use nanmean and nanstd and set those zeros to NaN instead of 0.

The problem, of course, is that on the rare occasions where you have zeros in your original matrix, those will end up being set to NaN during the triangular matrix step.

Using logical indexing on the yy matrix doesn't preserve the matrix shape, and thus cannot be used. This seems like the fastest way, though I will admit that if you have -NaN in your data for some reason, it will cause problems (but honestly, wouldn't it cause problems anyway?)

Comments and Ratings on Paul's Files View all
Updated File Comment by Comments Rating
20 Jul 2013 Outlier removal v2 Rosner's many outlier test, vectorized. Author: Paul Romesh

The vectorizing ideas are nice, but I think there are two problems with the method

1. When calculating R the actual data should be used (i.e. mean and std of the data) but after you do

[ ys, is ] = sort( abs( y - ybar ));

you construct yy from ys and calculate R using nanmean and nanstd from yy, which is being applied to the magnitude of the difference between the original data and the mean rather than the data itself. You probably need something 'ys = y(is)' before constructing yy

2. GESD rejects at outliers all of the points up to the last R_i > lambda_i so if you have Ri > lambda_i being [1 1 0 1 0 0] you need to reject 4 points, not 3

Lastly, the messing around with -Inf is really ugly. You should use tril to get an array corresponding to the lower entries which you can then insert the NaNs from. For example,

yy = rot90(triu(repmat(ys, 1, k), k-n), 2)' + tril(nan(size(yy)),-1);

04 Jul 2012 Outlier removal v2 Rosner's many outlier test, vectorized. Author: Paul Paul

Pasco:

Thanks for the advice! Somehow when I tested it with a bunch of 0s and 1s the -NaN thing seemed to work, dunno why. I've changed it at least temporarily to be -Inf. I'll look into modifying how the mean/std is done at some point, since this is clearly a placeholder.

04 Jul 2012 Outlier removal v2 Rosner's many outlier test, vectorized. Author: Paul Alquim, Pasco

Paul,

I didn't study your program so that I can propose alternatives but my comment goes to the fact that

1- There is no such a thing as a "-NaN". NaNs are not positive or negative. They are just ... NaN

2- yy(yy == -NaN) = 0; will never work because by definition all numbers are different from a NaN, including the NaN itself (just try (NaN == NaN)). But you can get what you want with

yy(isnan(yy)) = 0;

As regarding the usage of NaN instead of zeros, well perhaps you can

1- count them and remove them temporarily.
2- compute the means and stand deviation taking into account the true number of non-zero values (that's for sure what nanmean is doing when it ignores the NaNs)

03 Jul 2012 Outlier removal v2 Rosner's many outlier test, vectorized. Author: Paul Paul

Pasco: Do you have an alternative way to do that? I'd be very interested to know, because I couldn't find anything good that still vectorizes the code.

The key is that I make an upper-triangular matrix there, and I need the mean and standard deviation of the upper triangular half. However, if I leave zeros in the bottom half, they are included in the mean. So the only way I saw to do it is to use nanmean and nanstd and set those zeros to NaN instead of 0.

The problem, of course, is that on the rare occasions where you have zeros in your original matrix, those will end up being set to NaN during the triangular matrix step.

Using logical indexing on the yy matrix doesn't preserve the matrix shape, and thus cannot be used. This seems like the fastest way, though I will admit that if you have -NaN in your data for some reason, it will cause problems (but honestly, wouldn't it cause problems anyway?)

03 Jul 2012 Outlier removal v2 Rosner's many outlier test, vectorized. Author: Paul Alquim, Pasco

A bit difficult to trust in a file that does things like:

ys(ys == 0) = -NaN;

or

yy(yy == -NaN) = 0;

Contact us