Asked by James
on 21 Sep 2011

I have a matrix of values, for example: [0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 1 0 0 ...]

In practice this is a vector (1 row by somewhere around 80000 columns). I need to output a vector with the duration of these 'bursts' of 1's. For example, the output for the above would be: [3 2 5 ...]

Background: I have a time series where all the 1's are values above a threshold and all the 0's are values below a threshold. So by doing this I'm trying to investigate the duration of these extreme events.

Can anyone tell me how I can do this?

*No products are associated with this question.*

Answer by Image Analyst
on 21 Sep 2011

Accepted answer

If you have the Image Processing Toolbox, just call bwconncomp() or bwlabel(), then call regionprops() asking for 'Area' and then extract the areas from the structure. Three lines.

labeledMatrix = bwlabel(data); measurements = regionprops(labeledMatrix, 'Area'); allAreas = [measurements.Area];

Answer by Derek O'Connor
on 21 Sep 2011

If you don't have any toolboxes then this plain Matlab function may help. It is based loosely on the WordCount procedure in Kernighan & Plauger [1981], *Software Tools in Pascal*, ©Bell Labs, Addison-Wesley, page 17.

For `n = 10^8` it takes about 4 secs (single core) 2.3GHz, R2008a 64bit on Windows 7.

function [start, len, k1] = ZeroOnesCount(v) % % Determine the position and length of each 1-string in v, % where v is a vector of 1s and 0s % Derek O'Connor 21 Sep 2011 % n = length(v); start = zeros(1,n); % where each 1-string starts len = zeros(1,n); % length of each 1-string k1= 0; % count of 1-strings inOnes = false; for k = 1:n if v(k) == 0 % not in 1-string inOnes = false; elseif ~inOnes % start of new 1-string inOnes = true; k1 = k1+1; start(k1) = k; len(k1) = 1; else % still in 1-string len(k1) = len(k1)+1; end end %------------- ZeroOnesCount ----------------------------

>> v = [0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 1 0 0]; >> [start,len,k1] = ZeroOnesCount(v);disp([start(1:k1); len(1:k1)]); 7 15 21 3 2 5

Answer by Derek O'Connor
on 22 Sep 2011

This is a more general and simpler solution than my previous answer.

%------------------------------------------------------- function [start, val, len, rn] = RunsCount(v) % % Determine the length of each val-run in v, % where v is a vector of 'values' % Example 1: RunsCount([2 2 2 1 1 9 8 8 3 3 3 3]) gives % start 1 4 6 7 9 % val 2 1 9 8 3 % len 3 2 1 2 4 % Example 2: RunsCount(['aaaabbaacccbbb']); % start 1 5 7 9 12 % val 97 98 97 99 98 % len 4 2 2 3 3 % % Derek O'Connor 22 Sep 2011 % n = length(v); val = zeros(1,n); % run value len = zeros(1,n); % run length start = zeros(1,n); % pos. in v where run starts start(1) = 1; rk = 1; % number in run rn = 1; % number of runs for k = 2:n if v(k) == v(k-1) % in run rk = rk+1; else % end of run val(rn) = v(k-1); len(rn) = rk; rk = 1; % v(k) is start of rn = rn+1; % next run start(rn) = k; % position of start in v end end val(rn) = v(n); % last run len(rn) = n - sum(len); % %------------- End RunsCount ----------------------------

**EDIT** : Pre-allocated all output arrays. Changed loop counter from `i` to `k`.

Answer by Jan Simon
on 22 Sep 2011

A modified version of Derek's solution:

function len = CountOnes(v) n = length(v); len = zeros(1,n); % length of each 1-string k1 = 0; % count of 1-strings inOnes = false; s = 0;

for k = 1:n if v(k) % not in 1-string s = s + 1; % increase the counter inOnes = true; elseif inOnes % leave the ones block k1 = k1 + 1; len(k1) = s; s = 0; % reset the counter inOnes = false; end end len = len(1:k1);

And this is a vectorized approach:

function len = CountOnes_vectorized(v) v = diff([0, x, 0]); len = findstr(v, -1) - findstr(v, 1);

It looks nice, but is 45% slower than the loop method for the 80'000 elements needed by the OP.

**[EDITED]** New version with a nested loop:

function len = CountOnes2(v) n = length(v); len = zeros(1, ceil(n/2), 'uint32'); j = 0;

k = 1; while k <= n if v(k) a = k; k = k + 1; while k <= n && v(k) k = k + 1; end j = j + 1; len(j) = k - a; end k = k + 1; end len = len(1:j);

Leendert
on 12 Mar 2014

Hi Jan,

I would like to ask you some 'dummy' questions about your function. Hope you have the time to answer them. I see the function works fine, but I can't figure out how it works! :) And I would like to rebuild it to count each run of consecutive zeros in an array.

- What does " if v(k) " and "&& v(k)" test for? I only know if structure with operators like >,<, == and so on.
- You have used a while loop. In what way is this better than a for loop?

It would be great if your could post the function with (extensive/dummy understandable) comments on what each line does.

Hope to learn from you!

Thanx in advance

Answer by Derek O'Connor
on 22 Sep 2011

Jan,

I'm using the Add-an-Answer window because the comment window is hard to use for all but short, text-only replies.

I posted an older version of `RunsCount` (no pre-allocation) by mistake. At that time I wasn't sure how long the output arrays should be. The size varies with the input but the worst is `n`, so pre-allocate the output arrays of length `n`.

I take your point that the array start, etc., are not necessary, but I suspect this information might be needed in other applications of these counters. Anyway, to compare your modification with `ZeroOnesCount` and `RunsCount`, I stripped out all but the output array `len(1,n)`.

`RC3` is `RunsCount` with `start` and `val` array and their ops stripped out

`ZOC2` is `ZeroOnesCount` with `start` array and its ops stripped out.

`COJ` is `CountOnes`, Jan's modification of `ZOCorig`

The table below are the normalized average times, averaged over 100 runs
The final row gives the actual average times for `n = 10^8`.

Normalized Average Times

n RC3 ZOCorig ZOC2 COJ ----------------------------------------------- 10^3 1.53 3.84 1 1.16 10^4 1.26 1.49 1 1.11 10^5 1.19 1.24 1 1.15 10^6 1.15 1.25 1 1.31 10^7 1.15 1.24 1 1.32 10^8 1.14 1.26 1 1.32 ----------------------------------------------- Actual(secs)10^8 3.84 4.23 3.36 4.44

I was surprised by this result: `ZOC2` is uniformly better than `COJ`.

I profiled both functions with `n = 10^8` and compared statement counts (timing info is ignored because it is useless for comparisons).

COJ ZOC2

1 10 for k = 1:n 1 11 for k = 1:n 100000000 11 if v(k) 100000000 12 if v(k) == 0 49999019 12 s = s + 1; 50000981 13 inOnes = false; 49999019 13 inOnes = true; 49999019 14 elseif ~inOnes 50000981 14 elseif inOnes 25002599 15 inOnes = true; 25002599 15 k1 = k1 + 1; 25002599 16 k1 = k1+1; 25002599 16 len(k1) = s; 25002599 17 len(k1) = 1; 25002599 17 s = 0; 24996420 18 else 25002599 18 inOnes = false; 24996420 19 len(k1) = len(k1)+1; 25002599 19 end 24996420 20 end 100000000 20 end 100000000 21 end

At first glance there seems to be very little difference in the counts: `if-elseif` about the same, `inOnes = true-false`, about the same.

The only difference I can see is that `COJ` does about `75x10^6` ops on the counter `s` (lines 12,17) which are not done by `ZOC2`. However, `ZOC2` does about `25x10^6` ops extra on `len` (line 19) which are not done by `COJ`.

Thus `COJ` is doing about `50x10^6` extra assignment operations. This would seem to account for the difference between the two functions.

Show 1 older comment

Derek O'Connor
on 23 Sep 2011

None of the functions in the timing tests above uses uint32, so your comment is not about these results but about new functions that use uint32. If you have written new functions and done timing tests then please don't be coy, let us have your results.

Indeed, given COJ and ZOC2 are designed for zeros and ones, why not use uint16, uint8, uint4, and last, but not least, UINT1 -- the bit vector?

Your second comment, unlike your first comment, is about ZOC2 ('your function'), when you ask "Does your function catch the last 1? E.g. what is the reply for [1]?" Again you are being coy. Have you found that ZOC2 does not work for this contrived example? If so tell us. Better still, tell us how to fix it.

Answer by Jos (10584)
on 18 Mar 2014

Another idea, not tested for speed:

A = [0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 1 0 0]

B = cumsum(a) tf = B(2:end) == B(1:end-1) & A(1:end-1)==1 result = diff([0 B(tf)])

Opportunities for recent engineering grads.

## 0 Comments