Calculate Sn (like median absolute deviation) using matlab.

on 18 Apr 2013

Andrei Bobrov (view profile)

Hi, I'm trying to compute Sn value based on : "Alternatives to the Median Absolute Deviation Peter J. Rousseeuw and Christophe Croux" paperwork.

```Sn = c*median_i{median_j|xi - xj|}
```

For each i we compute the median of {|xi-xj|; j = 1,...,n}. This yields n numbers, the median of which gives our final estimate Sn. C is constant value.

I have Y matrix of size 10000 x 2000 (i = 2000, j = 10000) Because of the size, the simplest loop method:

```for i = 1 : size(Y,2)
for j = 1 : size(Y,1)
a(j) = median( abs( Y(:,i)-Y(j,i) ) );
end
Sn(i) = c * median(a);
end
```

is time consuming, so is no good at all. I'm matlab newbie and I don't know how to use _repmat _ function which will be - as I guess - very helpful. Can I ask you for help, how to write this so that the computation time will be comparable to _mad _ function?

Products

No products are associated with this question.

Andrei Bobrov (view profile)

on 18 Apr 2013
Edited by Andrei Bobrov

Andrei Bobrov (view profile)

on 18 Apr 2013
```s = [3 1130;
4 1527;
3  907;
2  878;
4  995];
```
```k = permute(abs( bsxfun(@minus,s,permute(s,[3 2 1])) ),[1 3 2]);
out = squeeze(median(median(k)));
```

```out = zeros(1,size(s,2));
for jj = 1:size(s,2)
out(jj) = median(median(abs(bsxfun(@minus,s(:,jj),s(:,jj)'))));
end
```

```out = mad(s,1);
```

Witold

Witold (view profile)

on 18 Apr 2013

Thank you very much, that's what I was thinking about, unfortunately I got the error message:

```Error using bsxfun
Out of memory. Type HELP MEMORY for your options.
```

memory available: win 8 x64 / 8GB/ r1012b

```Maximum possible array:      6163 MB (6.462e+09 bytes) *
Memory available for all arrays:      6163 MB (6.462e+09 bytes) *
Memory used by MATLAB:      1009 MB (1.058e+09 bytes)
Physical Memory (RAM):      8157 MB (8.553e+09 bytes)
```
```*  Limited by System Memory (physical + swap file) available.
```

should I give up? or mayby try to split columns of the matrix

Andrei Bobrov

on 18 Apr 2013

Image Analyst (view profile)

on 18 Apr 2013

I think you're totally misinterpreting the formula. You don't look at the difference of all possible pairs of points - that would take forever and isn't what the formula says. Look at Wikipedia: http://en.wikipedia.org/wiki/Median_absolute_deviation For each element, you'd have a million differences. And there are a million elements so you'd have a billion values. No wonder it takes forever! The i and j DO NOT REFER TO ROW AND COLUMN. They refer to element, or "linear index" in MATLAB lingo.

I think what they mean is to take the mean over all elements (pixels) and then take the median of the differences, which is totally different. So

```% Get the median value of all elements (pixels).
medianOfWholeMatrix = median(Y(:));
% Find the difference between each element and the overall median.
differenceImage = double(Y) - medianOfWholeMatrix;
% Now take the median of that:
```

This should be substantially faster, not to mention correct.

Witold

Witold (view profile)

on 18 Apr 2013

the output (Sn) should be a vector of size 1 x n where n is num. of columns

Image Analyst

Image Analyst (view profile)

on 18 Apr 2013

Their adaption of MAD might do something different. You could do the same thing on a column-by-column basis. If you want to do that, you could just extract a column at a time and put it in a loop

```for col = 1 : size(Y, 2)
% Get the median value of all elements (pixels) in one column.
medianOfWholeColumn = median(Y(:, col));
% Find the difference between each element and the overall median.
differenceImage = double(Y(:, col)) - medianOfWholeColumn ;
% Now take the median of that:
end
```
Witold

Witold (view profile)

on 18 Apr 2013

I'm afraid my interpretation is correct, I founded an example of Sn algorithm: http://www.maplesoft.com/support/help/AddOns/view.aspx?path=Statistics/RousseeuwCrouxSn

consider M 5x2 matrix:

```M = [3 1130;
4 1527;
3  907;
2  878;
4  995]
```

for each column we calculate x(:,i) - x(i,j) and its median, for 1st column it is:

```abs(x(:,1) - x(1,1)) = [0;1;0;1;1] -> med(..) = [1]
abs(x(:,1) - x(2,1)) = [1;0;1;2;0] -> med(..) = [1]
abs(x(:,1) - x(3,1)) = [0;1;0;1;1] -> med(..) = [1]
abs(x(:,1) - x(4,1)) = [1;2;1;0;2] -> med(..) = [1]
abs(x(:,1) - x(5,1)) = [1;0;1;2;0] -> med(..) = [1]
```

now we calculate median of those medians which is 1, and we move to second column - using my simplest loops we get 117 which is correct answer.

I agree that their adaption of MAD is something different, I was just wondering if there is any method to calculate it faster... it takes up to 5 minutes to compute it for 1000 x ~2000 matrix, which is ok if worse comes to worst, but much over hours(?) for size 10k /: Anyway, thank you for help!

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi test