How do I average all the values for each column in a cell array?

Hi,
I have a cell array called new_mat. I would like to compute the mean of all the values in each column and save the result in a new array called averages. I would then have a numerical array with one row and five columns, so five values in total.
How would I do that?
I tried this
avg_cols = cellfun(@(x) mean(x, 1), new_mat, 'UniformOutput', false);
But I still get this
avg_cols =
5×5 cell array
{[ 2.9473]} {[ 0.7736]} {[24.7335]} {[-32.1028]} {[ 5.4609]}
{[ 7.9357]} {[15.6115]} {[28.3915]} {[ 51.8624]} {[ 1]}
{[38.3376]} {[62.5463]} {[35.4955]} {[ 17.6059]} {[ 35.9168]}
{[15.0732]} {[24.9668]} {[ 3.2505]} {[-21.6557]} {1×0 double}
{[57.9756]} {[49.9486]} {[53.4301]} {[ 45.9361]} {[-17.1092]}
Any ideas why the columns are not averaged?

Accepted Answer

Voss on 13 Dec 2022
Edited: Voss on 13 Dec 2022
cellfun operates on the contents of each cell independently, performing the specified function (in this case the function is mean(x,1)). If the outputs of those function calls are all scalars of the same class, then cellfun is able to combine all the results into an array (which it will do by default). Otherwise, you need to use 'UniformOutput',false to have cellfun return a cell array.
Examples:
C1 = {[1 1] [2]} % cell array with two cells: 1st contains a 1x2 vector, 2nd contains a scalar
C1 = 1×2 cell array
{[1 1]} {[2]}
try
    result = cellfun(@(x)x,C1) % the function @(x)x just returns the contents of the cell
catch e % error: non-scalar output at 1st cell (which is [1 1] - obviously non-scalar).
    % That is, cellfun can't combine [1 1] and [2] into a 1-by-2 matrix (the size of C1)
    disp(e.message);
end
Non-scalar in Uniform output, at index 1, output 1. Set 'UniformOutput' to false.
C2 = {single(1) double(2)} % cell array with two cells: both containing scalars but different classes
C2 = 1×2 cell array
{[1]} {[2]}
try
    result = cellfun(@(x)x,C2) % the function @(x)x just returns the contents of the cell (again)
catch e % error: mismatch in type of outputs (single vs double)
    % That is, cellfun doesn't know what class the result should be
    disp(e.message);
end
Mismatch in type of outputs, at index 2, output 1 (single versus double). Set 'UniformOutput' to false.
In both of those examples, you must use 'UniformOutput',false to have cellfun return a cell array instead of trying to construct a numeric matrix and erroring out. Of course, since the function is @(x)x, the resulting cell array from cellfun will be the same as what you gave it.
result = cellfun(@(x)x,C1,'UniformOutput',false)
result = 1×2 cell array
{[1 1]} {[2]}
isequal(result,C1) % the same as what you started with
ans = logical
1
result = cellfun(@(x)x,C2,'UniformOutput',false)
result = 1×2 cell array
{[1]} {[2]}
isequal(result,C2) % the same as what you started with
ans = logical
1
Now, to turn to your cell array:
load new_mat
new_mat % notice the cell in the 4th row, 5th column contains an empty array
new_mat = 5×5 cell array
{[ 2.9473]} {[ 0.7736]} {[24.7335]} {[-32.1028]} {[ 5.4609]}
{[ 7.9357]} {[15.6115]} {[28.3915]} {[ 51.8624]} {[ 1]}
{[38.3376]} {[62.5463]} {[35.4955]} {[ 17.6059]} {[ 35.9168]}
{[15.0732]} {[24.9668]} {[ 3.2505]} {[-21.6557]} {0×0 double}
{[57.9756]} {[49.9486]} {[53.4301]} {[ 45.9361]} {[-17.1092]}
new_mat consists of 24 cells that contain a scalar and one cell that contains an empty array. The function you want to run is @(x)mean(x,1), which will return a scalar on each of the 24 cells containing scalars and will return an empty array on the cell that contains an empty array. Since not all results will be scalars, you must use 'UniformOutput',false. Of course, the mean of a scalar is the scalar itself and the mean of an empty array is an empty array, so the result you get is essentially what you started with (the difference is that the 0x0 empty array becomes 1x0 when passed through mean(x,1)).
result = cellfun(@(x)mean(x,1),new_mat,'UniformOutput',false)
result = 5×5 cell array
{[ 2.9473]} {[ 0.7736]} {[24.7335]} {[-32.1028]} {[ 5.4609]}
{[ 7.9357]} {[15.6115]} {[28.3915]} {[ 51.8624]} {[ 1]}
{[38.3376]} {[62.5463]} {[35.4955]} {[ 17.6059]} {[ 35.9168]}
{[15.0732]} {[24.9668]} {[ 3.2505]} {[-21.6557]} {1×0 double}
{[57.9756]} {[49.9486]} {[53.4301]} {[ 45.9361]} {[-17.1092]}
isequal(result,new_mat) % not the same
ans = logical
0
isequal(result([1:23 25]),new_mat([1:23 25])) % but the only difference is in the 24th cell (row 4, column 5)
ans = logical
1
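To make the point about the dimension argument concrete, here is a minimal sketch (the variable names s, e, a, b, sz1 and sz2 are mine, chosen only for illustration). For a scalar the dimension argument changes nothing; for a 0x0 empty array it only changes the shape of the empty result:

```matlab
s = 5;                      % a scalar, like the contents of most cells in new_mat
a = mean(s, 1);             % mean of a scalar along dimension 1 is the scalar itself
b = mean(s, 2);             % same along dimension 2

e = zeros(0, 0);            % a 0x0 empty array, like the contents of new_mat{4,5}
sz1 = size(mean(e, 1));     % averaging along dimension 1 leaves a 1x0 empty array
sz2 = size(mean(e, 2));     % averaging along dimension 2 leaves a 0x1 empty array

disp([a b])
disp(sz1)
disp(sz2)
```

Either way the result for the empty cell is still empty, which is why cellfun cannot pack it into a numeric array alongside the 24 scalars.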
OK, so that's a cellfun primer. As I said, cellfun operates on each cell independently. But you want to operate on columns of cells together, so that makes cellfun ill-suited to the task. It's straightforward to write a loop to do what you want:
N = size(new_mat,2);
result = zeros(1,N);
for ii = 1:N
    result(ii) = mean([new_mat{:,ii}]);
end
disp(result)
24.4539 30.7694 29.0602 12.3292 6.3171
which could also be written:
N = size(new_mat,2);
result = zeros(1,N);
for ii = 1:N
    result(ii) = mean(vertcat(new_mat{:,ii}));
end
disp(result)
24.4539 30.7694 29.0602 12.3292 6.3171
The difference is that the first loop horizontally concatenates the contents of the cells in a given column, and the second loop vertically concatenates the contents of the cells in a given column. In either case the result of that concatenation is a vector, so no dimension argument is required for mean (that is, it's mean(x), not mean(x,1)), but you could include one (it would be 2 for the horizontal concatenation case and 1 for the vertical).
Note that the dimension argument sent to mean has nothing to do with how the cells are arranged in new_mat! You're wanting to do @(x)mean(x,1) because you are thinking of averaging a column of cells, but the function @(x)mean(x,1) when used in cellfun doesn't operate on a column of cells - it operates on one cell at a time. Each cell contains a scalar (or empty array), so the dimension argument passed in to mean() is irrelevant.
In order to take the mean of several elements at a time, you've got to concatenate them together somehow - that's what the code inside the loops does ([new_mat{:,ii}] to horizontally concatenate the contents of the cells in the iith column of new_mat, or vertcat(new_mat{:,ii}) to concatenate the same things vertically).
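If you prefer a one-liner over the loop, the same concatenate-then-average idea can be sketched with arrayfun over the column indices. This is only an alternative phrasing of the loop, not something new_mat requires. Since the .mat file isn't attached here, the sketch rebuilds new_mat from the values displayed in this thread (so only to four decimal places), and col_mean is a helper name I made up:

```matlab
% new_mat rebuilt from the displayed values (assumption: 4-decimal precision)
new_mat = { 2.9473,  0.7736, 24.7335, -32.1028,   5.4609
            7.9357, 15.6115, 28.3915,  51.8624,   1
           38.3376, 62.5463, 35.4955,  17.6059,  35.9168
           15.0732, 24.9668,  3.2505, -21.6557,  []
           57.9756, 49.9486, 53.4301,  45.9361, -17.1092};

% Concatenate the contents of column j horizontally, then average.
% The empty cell contributes nothing to the concatenation, so column 5
% is averaged over its four remaining values.
col_mean = @(j) mean([new_mat{:,j}]);
averages = arrayfun(col_mean, 1:size(new_mat,2));
disp(averages)
```

Here arrayfun iterates over the numbers 1 to 5 rather than over the cells, so each call sees a whole column at once, and each call returns a scalar, so the default uniform output works.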
  3 Comments
lil brain on 13 Dec 2022
Hi @Voss! Wow, this is super helpful and detailed. Many thanks for that! I learned a lot just reading this. I understand the problem much better now.
One last thing however, since you mentioned that taking the mean of a scalar simply gives you the scalar itself, would it make sense to convert the cells that contain scalars to something else? Would that allow me to create the mean along the column dimension then?
Thanks again for the help!
Voss on 13 Dec 2022
@lil brain: You're welcome! I'm glad it's useful.
"would it make sense to convert the cells that contain scalars to something else?"
I don't know what you'd convert them to.
If you didn't have that one cell that contains an empty array, then you could convert the entire 5-by-5 cell array to a numeric matrix. Let's say you replace that empty array in that one cell with a scalar NaN:
load new_mat
new_mat{4,5} = NaN
new_mat = 5×5 cell array
{[ 2.9473]} {[ 0.7736]} {[24.7335]} {[-32.1028]} {[ 5.4609]}
{[ 7.9357]} {[15.6115]} {[28.3915]} {[ 51.8624]} {[ 1]}
{[38.3376]} {[62.5463]} {[35.4955]} {[ 17.6059]} {[ 35.9168]}
{[15.0732]} {[24.9668]} {[ 3.2505]} {[-21.6557]} {[ NaN]}
{[57.9756]} {[49.9486]} {[53.4301]} {[ 45.9361]} {[-17.1092]}
Now all the cells contain scalars, so you can put all those scalars together into a numeric matrix the same size as your original cell array:
% using cell2mat:
M = cell2mat(new_mat)
M = 5×5
2.9473 0.7736 24.7335 -32.1028 5.4609
7.9357 15.6115 28.3915 51.8624 1.0000
38.3376 62.5463 35.4955 17.6059 35.9168
15.0732 24.9668 3.2505 -21.6557 NaN
57.9756 49.9486 53.4301 45.9361 -17.1092
% or, concatenating and reshaping:
M = reshape([new_mat{:}],size(new_mat))
M = 5×5
2.9473 0.7736 24.7335 -32.1028 5.4609
7.9357 15.6115 28.3915 51.8624 1.0000
38.3376 62.5463 35.4955 17.6059 35.9168
15.0732 24.9668 3.2505 -21.6557 NaN
57.9756 49.9486 53.4301 45.9361 -17.1092
Now, I don't know, in your application, whether using a NaN instead of an empty array is a good idea (maybe you still need to distinguish NaN from empty, in which case you don't want to replace one with the other), but if it makes sense to do that (or use some other scalar place-holder value like Inf), then it's convenient to use a numeric matrix like above instead of a cell array where all the cells contain a scalar.
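Once the data is in a numeric matrix with NaN as the placeholder, the column means can be taken directly. A minimal sketch with a small hypothetical matrix (the values in M below are made up for illustration, not taken from new_mat): a plain mean lets the NaN propagate into its column, while the 'omitnan' flag skips it.

```matlab
M = [1   2
     NaN 4];                            % hypothetical data with one NaN placeholder

with_nan    = mean(M, 1);               % NaN propagates: first column mean is NaN
without_nan = mean(M, 1, 'omitnan');    % NaN is skipped: first column mean is 1

disp(with_nan)
disp(without_nan)
```

With 'omitnan', a column containing the placeholder is averaged over its remaining values, which matches what the concatenation loop does when it drops the empty cell.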


More Answers (1)

Walter Roberson on 12 Dec 2022
avg_cols = cellfun(@(x) mean(x, 1), new_mat);
You only need non-uniform output if some of the outputs can be a different size or datatype than the others, or if the output datatype is one that cannot be concatenated into an array (for example, function handles).
  2 Comments
lil brain on 12 Dec 2022
Thanks!
However, if I try this it gives me the error:
Error using cellfun
Non-scalar in Uniform output, at index 24, output 1.
Set 'UniformOutput' to false.
Is this because some columns are not equal in length?
Walter Roberson on 13 Dec 2022
Ah, you have an empty cell. mean of empty is empty. That prevents you from creating a numeric array of results.
If you were to run
mask = cellfun(@isempty, new_mat);
new_mat(mask) = {nan}; % braces needed: () indexing into a cell array assigns cells
then the cellfun call would return NaN for those entries.
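A minimal sketch of that replacement on a small hypothetical cell array (C and its values are made up for illustration). Note the braces on the right-hand side of the assignment: parenthesis indexing into a cell array expects cells, so the NaN has to be wrapped in one.

```matlab
C = {1, 2
     [], 4};                          % hypothetical data with one empty cell

mask = cellfun(@isempty, C);          % logical array, true where a cell is empty
C(mask) = {NaN};                      % put a scalar NaN into each empty cell

% Every cell now holds a double scalar, so the default uniform output works
vals = cellfun(@(x) mean(x, 1), C);
disp(vals)
```

The result vals is an ordinary double matrix the same size as C, with NaN where the empty cell used to be.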
