Cumulative sum with conditions

yp78 on 3 Oct 2018
Edited: Bruno Luong on 4 Oct 2018
I am counting the number of 1's in an array.
But I want to count the cumulative sum only when the run of 1's is followed by a sequence of zeros.
% For instance,
h1=[1 0 0]
h2=[1 1 0]
h3=[1 1 1]
h4=[0 0 1]
h5=[0 1 0]
I want to count the cumulative sums (S) of those arrays as follows.
% The ideal outputs are:
S1= 1
S2= 2
S3= 0
S4= 0
S5= 0
since h3 through h5 do not meet the condition of 1's followed by 0's.
I would truly appreciate it if anyone could suggest an efficient way of programming these conditional sums.
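The rule, as it is clarified later in the thread (a row must start with 1, and the 1's must be followed only by 0's), can be written out as a plain loop. This is a minimal reference sketch, not code from the thread; it assumes the five arrays are stacked as rows of one matrix h, and the names S, row and k are illustrative.

```matlab
% Rows are h1..h5 from the question, stacked into one matrix
h = [1 0 0; 1 1 0; 1 1 1; 0 0 1; 0 1 0];
S = zeros(size(h, 1), 1);
for i = 1:size(h, 1)
    row = h(i, :);
    k = find(row == 0, 1);                 % position of the first 0, if any
    % valid only if there is a 0, it is not in first place,
    % and nothing but 0's follows it
    if ~isempty(k) && k > 1 && ~any(row(k:end))
        S(i) = k - 1;                      % number of leading 1's
    end
end
disp(S)
```

For these five rows S comes out as [1; 2; 0; 0; 0], matching the ideal outputs listed in the question.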
  1 Comment
Rik on 3 Oct 2018
So you have these as numbered variables? Then you should really have them in a single array. Also, I don't understand the rules here. Are we meant to look row by row? If so, doesn't the 5th row count as well (as there is a 0 after the 1)?


Accepted Answer

Bruno Luong on 4 Oct 2018
Edited: Bruno Luong on 4 Oct 2018
Change the last line so that it can also handle a single row:
m = size(h,1);
d = diff(h,1,2);
[r,c] = find(d==-1);
b = sum(abs(d),2)==1;
s = accumarray(r(:),c(:),[m 1]).*double(b)
This is needed because find() returns row vectors when its input is a row vector.
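The orientation issue can be checked on the single-row case reported in the thread. This is a small sketch, assuming h = [1 0 1 0]; the variable rowShaped is illustrative only.

```matlab
d = [-1 1 -1];                    % diff of the single row h = [1 0 1 0]
[r, c] = find(d == -1);           % r = [1 1], c = [1 3], both 1-by-2 rows
rowShaped = isrow(r) && isrow(c);
% Passing the 1-by-2 r directly makes accumarray read it as ONE
% two-dimensional subscript, so c would have to hold a single value.
% r(:) and c(:) force column vectors: one subscript per element.
s = accumarray(r(:), c(:), [1 1]);   % sums c over row index 1
```

Here s is 4 before the mask b is applied; in the full code b is false for this row (it has three changes), so the final result is 0.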
  1 Comment
yp78 on 4 Oct 2018
Edited: yp78 on 4 Oct 2018
It worked perfectly! Many thanks for your patience and kind help. It was also a good opportunity for me to learn new functions such as accumarray.


More Answers (4)

Sean de Wolski on 3 Oct 2018
h = [1 0 0;
1 1 0;
0 0 1;
1 1 1;
0 1 0];
s = sum(h, 2).*~h(:,end)
This requires the last element to be zero. It wouldn't work for [1 0 1], but I don't know what you'd want for that.

yp78 on 3 Oct 2018
Edited: yp78 on 3 Oct 2018
Hi both, thank you for your immediate responses.
> Sean, your comment
"This wouldn't work for [1 0 1]"
is exactly where I am struggling. To explain in a bit more detail, I am running the statistical test described in Test for Cointegration Using the Johansen Test. In one trial (and after some extra operations) what I get is an array of binary values, 1's and 0's, e.g.
h=[1 1 0] .
From this I want to reach the conclusion "there are at most 2 cointegrating vectors", so in this case the number that I want to see is the cumulative sum of 2. But if it were
h= [1 0 1]
I want to count the sum as zero, as I cannot make any inference from the output.
Since I have to repeat this test 1000 times, I want to create a function that calculates the cumulative sum of an array only when it meets the condition of starting with 1, with the 1's followed by 0's.
  2 Comments
Sean de Wolski on 3 Oct 2018
In that case what I have above should work. It returns the sum and requires that the last element be a zero. If the last element is non-zero, it returns a zero.
With [1 0 1] it would say zero because the last element is not a zero.
yp78 on 4 Oct 2018
Edited: yp78 on 4 Oct 2018
Thank you Sean. But the problem still remains with h = [0 1 0], as it doesn't meet the condition of the array starting with 1 (sorry, this point was not clear in my previous comment).
When h = [0 1 0] I can't make any statistical inference, hence I need a return value of 0, whereas your current code gives 1.
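One possible way to cover the [0 1 0] case while keeping the sum-based idea is to also require that the row never steps from 0 back up to 1. This is a sketch under that assumption, not code posted in the thread; nonIncreasing and endsWithZero are illustrative names.

```matlab
h = [1 0 0; 1 1 0; 0 0 1; 1 1 1; 0 1 0; 1 0 1];
nonIncreasing = all(diff(h, 1, 2) <= 0, 2);   % no 0 -> 1 step anywhere in the row
endsWithZero  = ~h(:, end);                   % at least one trailing 0
% A non-increasing row that ends in 0 is a run of 1's followed by 0's,
% so its sum is the number of leading 1's (0 for an all-zero row)
s = sum(h, 2) .* nonIncreasing .* endsWithZero;
```

With these six rows, s is [1; 2; 0; 0; 0; 0], so both [0 1 0] and [1 0 1] return 0.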



Bruno Luong on 4 Oct 2018
Edited: Bruno Luong on 4 Oct 2018
h = [1 0 0;
1 1 0;
0 0 1;
1 1 1;
0 1 0];
[a,c] = max(1-h,[],2);
s = (c-1).*a.*h(:,1)
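What the two lines above compute can be traced on the same matrix. This is an explanatory sketch; the extra row h2 and the names a2, c2 and s2 are added here for illustration and are not part of the posted answer.

```matlab
h = [1 0 0; 1 1 0; 0 0 1; 1 1 1; 0 1 0];
[a, c] = max(1 - h, [], 2);
% a = [1;1;1;0;1]: 1 if the row contains at least one 0
% c = [2;3;1;1;1]: column of the first 0 (1 when the row has no 0)
s = (c - 1) .* a .* h(:, 1);      % leading 1's, zeroed if the row starts with 0

% Only the position of the FIRST 0 is used, so anything after it is ignored
h2 = [1 0 1];
[a2, c2] = max(1 - h2, [], 2);
s2 = (c2 - 1) .* a2 .* h2(:, 1);
```

For the five rows s is [1; 2; 0; 0; 0], but s2 is 1 for [1 0 1], whereas the asker wants 0 for that row.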
  3 Comments
Bruno Luong on 4 Oct 2018
Edited: Bruno Luong on 4 Oct 2018
Can you explain why it returns 0? The rule is still not clear...
Do you count only if the row consists of a sequence of 1s followed by a sequence of 0s (at least one 0)?
yp78 on 4 Oct 2018
1 and 0 correspond to the rejection (1) and non-rejection (0) of a statistical test. The hypotheses are tested sequentially. The null hypotheses are:
(trial 1) There are 0 relationships between the variables
(trial 2) There is at most 1 statistical relationship between the variables
(trial 3) There are at most 2 statistical relationships between the variables
(trial 4) There are at most 3 statistical relationships between the variables ...etc.
In order to obtain a useful inference, I need to know where the first 0 appears after a sequence of 1's. For instance, in the case of
h=[1 1 0 0 ]
I can infer that "there are 2 or fewer relationships". But if it were
h=[1 1 0 1 ]
the result is no longer reliable, because trial 3 does not reject
"there are 2 or fewer relationships in the data"
whereas the following trial rejects
"there are 3 or fewer relationships in the data", which contradicts the previous trial.



Bruno Luong on 4 Oct 2018
Edited: Bruno Luong on 4 Oct 2018
m = size(h,1);
d = diff(h,1,2);
[r,c] = find(d==-1);
b = sum(abs(d),2)==1;
s = accumarray(r(:),c(:),[m 1]).*double(b)
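The code above can be read as follows: a row qualifies when it has exactly one change between neighbouring elements and that change is a drop from 1 to 0; the column of the drop is then the number of leading 1's. The sketch below traces this on the two four-element rows discussed earlier in the thread; nChanges is an illustrative name.

```matlab
h = [1 1 0 0; 1 1 0 1];
d = diff(h, 1, 2);               % [0 -1 0; 0 -1 1]
nChanges = sum(abs(d), 2);       % [1; 2] changes per row
[r, c] = find(d == -1);          % 1 -> 0 drops: r = [1;2], c = [2;2]
b = nChanges == 1;               % keep rows with exactly one change
s = accumarray(r(:), c(:), [size(h, 1) 1]) .* double(b);
```

The first row gives 2 and the second gives 0, because its extra 0 -> 1 step makes nChanges equal to 2.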
  4 Comments
Torsten on 4 Oct 2018
Edited: Torsten on 4 Oct 2018
m = size(h,1);
yp78 on 4 Oct 2018
Very sorry for asking you again, but I still get an error saying:
Error using accumarray
Second input VAL must be a vector with one element for each row in SUBS, or a scalar.
The code is written as follows.
h = [1 0 1 0 ];
m = size(h,1);
d = diff(h,1,2);
[r,c] = find(d==-1);
b = sum(abs(d),2)==1;
s = accumarray(r,c,[m 1]).*double(b)
