Average Multiple Matrix Columns Based On Duplicate Entries In A Different Column Vector

I have a column vector consisting of hundreds of uint64 values in which duplicates appear. Duplicates are always grouped together, and each group appears only once. Each element of this vector corresponds to a row in a separate 2D matrix of double values. I would like to remove the duplicates from the column vector and replace the corresponding rows of the matrix with a single row in which each column is the average of the removed elements in that column, while keeping the order stable. So with data like this:
A = [1; 1; 2; 2; 3; 3]
B = [1 2 3
     2 3 4
     3 4 5
     4 5 6
     5 6 7
     6 7 8]
I would want the output to be
C = [1.5 2.5 3.5
     3.5 4.5 5.5
     5.5 6.5 7.5]
Some combination of unique and accumarray seems to be in order, but I cannot figure out how to handle the multiple-column part of the problem.

Accepted Answer

Guillaume on 13 Jul 2018
Edited: Guillaume on 13 Jul 2018
So if I understood correctly, the shape of B does not matter and it is to be considered a column vector.
[~, ~, subs] = unique(A, 'stable');
C = accumarray(subs, B(:), [], @mean)
Note that whether or not the groups appear only once does not matter for the above. All the values with the same corresponding A still get averaged.
Also note that if A is already integer values from 1 to n with no gap and in the right order, then the call to unique is unnecessary and you can just pass A instead of subs.
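If B is instead a matrix with one row per element of A, as in the question's sample data, one possible way to average each column separately is to give accumarray a two-column subscript matrix of (group number, column number) pairs. This is only a sketch on the sample data from the question; the variable names rows and cols are chosen here for illustration.

```matlab
% Sample data from the question
A = uint64([1; 1; 2; 2; 3; 3]);
B = [1 2 3; 2 3 4; 3 4 5; 4 5 6; 5 6 7; 6 7 8];

% Group number for each row of B, in order of first appearance
[~, ~, subs] = unique(A, 'stable');

% Pair every element of B with its (group number, column number)
[rows, cols] = ndgrid(subs, 1:size(B, 2));

% Average the elements that share the same (group, column) pair
C = accumarray([rows(:), cols(:)], B(:), [], @mean);
```

Because rows(:), cols(:) and B(:) are all unrolled in the same column-major order, each value of B lands in the output cell for its own group and column.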
  6 Comments
Jim Sculley on 16 Jul 2018
This works great for my sample data above, but when I try it with my real data I am getting an error:
Error using vertcat
Dimensions of arrays being concatenated are not consistent.
Error in splitapply>localapply (line 257)
finalOut{curVar} = vertcat(funOut{:,curVar});
Error in splitapply (line 132)
varargout = localapply(fun,splitData,gdim,nargout);
My A vector is 1335 x 1 (uint64)
My B matrix is 1335 x 632 (double)
findgroups returns a 1335 x 1 (double) with indices from 1 to 516.
I'm not sure why splitapply would be having a problem with this.
Guillaume on 16 Jul 2018
Edited: Guillaume on 16 Jul 2018
C = splitapply(@(m) mean(m, 1), B, g);
should fix the problem. If a group contains only one row, mean treats that row as a vector and averages along it, instead of averaging down each column as it does for a matrix. As a result, you get a scalar instead of a row for that group, hence the error "Dimensions of arrays being concatenated are not consistent." @(m) mean(m, 1) forces the mean to be taken along the first dimension (down each column) regardless.
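The difference is easy to see on a group with a single row. The following is a small sketch with made-up data (not the poster's real data) in which the second group has only one row; g is assumed to come from findgroups, as in the comment above.

```matlab
% Made-up data: the group with A == 2 contains a single row
A = uint64([1; 1; 2; 3; 3]);
B = [1 2 3; 2 3 4; 3 4 5; 5 6 7; 6 7 8];
g = findgroups(A);

% On a 1-by-3 row, mean without a dimension collapses the row to a scalar
m1 = mean(B(3, :));
% With the dimension given explicitly, the row comes back unchanged
m2 = mean(B(3, :), 1);

% Every group now yields a 1-by-3 row, so the results concatenate cleanly
C = splitapply(@(m) mean(m, 1), B, g);
```

Note that findgroups numbers the groups in sorted order of the values in A. If the order of first appearance has to be kept and A is not already sorted, the third output of unique(A, 'stable') could be used as g instead.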


Release

R2018a
