Using reshape to block-average data whose length is not divisible by the block size?

I have data sets of varying lengths that I want to block average (every 10 points), along with block max/min, standard deviation, etc. How can I use reshape to do this and avoid the not-divisible error? For example, my data set has 1086 rows and I want blocks of 10; it is fine if the last block is an average of only 6 points.

Accepted Answer

Sean de Wolski on 24 Jun 2015
Edited: Sean de Wolski on 26 Jun 2015
If you have the Image Processing Toolbox, you could use blockproc for this.
% data
x = rand(1086,104);
% Mean of each 10-by-10 block (partial blocks at the edges are processed as-is)
blockproc(x,[10 10],@(s)mean(s.data(:)))
% Min of each 10-by-10 block
blockproc(x,[10 10],@(s)min(s.data(:)))
etc.
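Note that a [10 10] block size pools 10 rows and 10 columns into each result. If the goal is instead a separate statistic per column over every 10 rows, one possible sketch is to make each block span all columns. The variable names blk, colMean, colMax and colStd are made up for illustration, and this assumes the default blockproc behavior of not padding partial blocks, so the last block simply has 6 rows.

```matlab
% Requires the Image Processing Toolbox (blockproc).
x = rand(1086, 104);            % same kind of data as in the answer above
blk = [10 size(x, 2)];          % each block: 10 rows by all columns

% blockproc passes a block struct; s.data holds the rows of that block.
% Reducing along dimension 1 keeps one value per column.
colMean = blockproc(x, blk, @(s) mean(s.data, 1));     % 109-by-104
colMax  = blockproc(x, blk, @(s) max(s.data, [], 1));  % 109-by-104
colStd  = blockproc(x, blk, @(s) std(s.data, 0, 1));   % 109-by-104
```

Row 109 of each result comes from the final 6 rows only (1081 to 1086).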

More Answers (1)

dpb on 24 Jun 2015
Edited: dpb on 24 Jun 2015
Just use modulo arithmetic to compute the size of the reshapeable portion and the remainder, where n is the block size...
L=size(x,1); % number of rows
N=n*fix(L/n); % number of rows that fill complete blocks of n
M=mod(L,n); % rows left over for the final, shorter block
You then use reshape(x(1:N,:),n,[]) for the full blocks and concatenate the statistics of the final M rows onto those results.
Example:
Let n=4 and there be three columns...
>> x=rand(11,3); % generate a set of data
>> n=4; % the block size
>> L=size(x,1); % the number of rows
>> N=n*fix(L/n) % the number of rows that _can_ be reshaped by n
N =
8
>> m=[reshape(mean(reshape(x(1:N,:),n,[])),N/n,[]); ...
mean(x(N+1:end,:),1)]
m =
0.4124 0.6394 0.6423
0.7079 0.4336 0.4435
0.6019 0.7830 0.6274
>>
Show that it agrees with explicitly calculated values...
>> [mean(x(1:4,:));mean(x(5:8,:));mean(x(9:11,:))]
ans =
0.4124 0.6394 0.6423
0.7079 0.4336 0.4435
0.6019 0.7830 0.6274
>>
QED.
If you have multiple statistics to run, you'll likely want to save the reshaped array and apply each function to it rather than repeating the explicit reshape every time. There may also be a Statistics Toolbox function that generates several of the desired statistics in a single call.
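As a rough sketch of that idea for the sizes in the question (1086 rows, blocks of 10), the full blocks can be reshaped once into a 3-D array and reused for each statistic. The names K, C, B, tail and bmean/bmax/bmin/bstd are only illustrative, and the three-column random data is an assumption. The dimension argument is given explicitly so that a single leftover row is still treated as a row of separate columns.

```matlab
x = rand(1086, 3);               % hypothetical data: 1086 rows, 3 columns
n = 10;                          % block size
[L, C] = size(x);                % number of rows and columns
K = fix(L/n);                    % number of complete blocks (108 here)
N = n*K;                         % rows covered by complete blocks (1080 here)

% Reshape once: B(:,k,c) holds block k of column c
B = reshape(x(1:N,:), n, K, C);

% Reduce along dimension 1, then fold back to K-by-C
bmean = reshape(mean(B, 1),    K, C);
bmax  = reshape(max(B, [], 1), K, C);
bmin  = reshape(min(B, [], 1), K, C);
bstd  = reshape(std(B, 0, 1),  K, C);

% Append the statistics of the shorter final block, if there is one
if N < L
    tail = x(N+1:end, :);        % the last mod(L,n) rows (6 here)
    bmean(end+1, :) = mean(tail, 1);
    bmax(end+1, :)  = max(tail, [], 1);
    bmin(end+1, :)  = min(tail, [], 1);
    bstd(end+1, :)  = std(tail, 0, 1);
end
```

Each result has one row per block (109 here), with the last row computed from the 6 leftover rows.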
ADDENDUM NB: note the use of N/n in the second reshape operation to return to the original number of columns. I presume from your original post that this is obvious, since you've apparently done this before, but as a reminder: stick with the reduced array size N for the blocked portion, and then do the final "cleanup" of the leftover rows if, as you've stated, you're OK with fewer elements in the statistics for the last group. Otherwise, just drop that last set; you then have one fewer observation, but every one is computed from a full set of n records.
  4 Comments
matlabuser12 on 24 Jun 2015
I am still getting an error at the reshape stage:
product of known dimensions, 108, not divisible into total number of elements, 1086
dpb on 24 Jun 2015
Yes, you use that to determine the maximum number of rows that can be reshaped by n. I was correct initially in computing N as the total; I just erred in the comment. Then I reacted too quickly and, instead of thinking it through, changed the code to match the comment rather than fixing the comment.
I've revised the answer to restore the intended computation with an appropriate comment, and added a small illustrative example.
Sorry for the misdirection; I tried to be too quick...
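For anyone hitting the same message, here is a small sketch of what probably happened. The exact call that failed is not shown in the thread, so the reshape inside the try block is a guess (using the block count, 108, as a dimension of the full 1086-element vector), and the single-column data and the names failed and y are purely illustrative.

```matlab
x = (1:1086).';                  % hypothetical single-column data
n = 10;                          % block size
failed = false;

try
    % 108 is the number of full blocks, but 1086/108 is not an integer
    reshape(x, [], fix(numel(x)/n));
catch err
    failed = true;               % reshape refuses sizes that do not divide evenly
    disp(err.message)
end

% Truncate to the largest multiple of n first, then reshape
N = n*fix(numel(x)/n);           % 1080
y = reshape(x(1:N), n, []);      % 10-by-108, one block per column
```

The leftover elements x(N+1:end) are then handled separately, as in the answer.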
