# Cumulative sum by group

37 views (last 30 days)
zppine on 27 Jul 2012
Answered: Quoc T. Phan on 19 Jun 2020
Hi, Could someone explain how to compute the running sum by group (without using for loop)? A example is: groupid = [1; 1; 1; 2; 2; 3; 3;]; value=[1;2;3;4;5;6;7]. the desired result is: runsum = [1;3;6; 4;9; 6;13]. Thanks a lot. zppine

Sean de Wolski on 27 Jul 2012
Edited: Sean de Wolski on 27 Jul 2012
Using a for-loop here may be your best option. However, my unconditional love for accumarray forces me to present you with this:
groupid = [1; 1; 1; 2; 2; 3; 3;];
value=[1;2;3;4;5;6;7];
gcs = cell2mat(accumarray(groupid,value,[],@(x){cumsum(x)}))

zppine on 27 Jul 2012
Hi, Sean. Very good idea. Thanks a lot.

Daniel on 3 Jul 2015
function S = labeledCumsum(X,L)
% e.g. X=[3 5 8 9 1 2 5 8 5 4 9 2]
% L=[0 1 1 2 2 0 0 1 1 1 0 5]
% result=[NaN 5 13 9 10 NaN NaN 8 13 17 NaN 2]
L = L(:);
X = X(:);
if numel(L) ~= numel(X)
error('The two inputs should be vectors of the same length.')
end
% Do the full cumulative sum
X(isnan(X)) = 0;
S = cumsum(X);
% Lookup the cumulative value just before the start of each segment
isStart = mask & [true ; L(1:end-1) ~= L(2:end)];
startInds = find(isStart);
if startInds(1) == 1
S_starts = [0 ; S(startInds(2:end)-1)];
else
S_starts = S(startInds-1);
end
% Subtract off the excess values (i.e. the ones obtained above)
L_safe = cumsum(isStart); % we do this to main case the labels in L were not sorted integers
% Put NaNs in the false blocks
S(L==0) = NaN;
end

Quoc T. Phan on 19 Jun 2020
Hello
When it comes to work with group, these two functions are great tools. They go together
1. findgroups: --> indexing groups
2. splitapply: --> split your data by index, then apply @function to it, then combine back to original
Here is a sample code
clear
% Create a table X
X = table(LastName,Gender,Height,Weight,Age)
% Sort rows by Gender and Last Name (Actually u dont need Last Name, but I
% just put here in case)
X = sortrows(X,{'Gender' 'LastName'},{'ascend' 'ascend'})
% Function 01: Using findgroups --> to return logical index
G = findgroups(X.Gender)
% (*) Note: by default they index by alphabet order. Although 1st obs is male,
% however, F(emale) is before M(ale). Then 1 is Female and 2 is Male.
% Function 02: Using splitapply
% Basically what it does it slit table X by logical index G
% It then apply function cumsum for variable X.Height
cumHeight = splitapply(@(x1){cumsum(x1)},X.Height,G)
% Change cell to matrix and add to table X
X.cumHeight = cell2mat(cumHeight)