For vectors, Y = RUNMEAN(X,M) computes a running mean (also known as a moving average) over the elements of the vector X, using a window of 2*M+1 datapoints. M is a positive integer defining (half) the size of the window. In pseudo code:
Y(i) = sum(X(j)) / (2*M+1), for j = (i-M):(i+M), and i=1:length(X)
For matrices, Y = RUNMEAN(X,M) or RUNMEAN(X,M,[]) operates on the first non-singleton dimension of X. RUNMEAN(X,M,DIM) computes the running mean along the dimension DIM.
If the total window size (2*M+1) is larger than the size in dimension DIM, the overall average along dimension DIM is computed.
As always with filtering, the values of Y can be inaccurate at the edges. RUNMEAN(..., MODESTR) determines how the edges are treated. MODESTR can be one of the following strings:
'edge' : X is padded with first and last values along dimension DIM (default)
'zero' : X is padded with zeros
'mean' : X is padded with the mean along dimension DIM
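The windowing and the three padding modes above can be sketched in a few lines. The following Python function is my own illustration of the idea, not the actual MATLAB implementation; the name `runmean` and its structure are assumptions for demonstration only.

```python
# Illustrative sketch of RUNMEAN's window and edge-padding modes.
# Not the actual MATLAB code; names and structure are my own.

def runmean(x, m, mode='edge'):
    """Running mean over a window of 2*m+1 points, with edge padding."""
    n = len(x)
    if mode == 'edge':            # repeat the first/last value
        pad_left, pad_right = [x[0]] * m, [x[-1]] * m
    elif mode == 'zero':          # pad with zeros
        pad_left = pad_right = [0.0] * m
    elif mode == 'mean':          # pad with the overall mean
        mu = sum(x) / n
        pad_left = pad_right = [mu] * m
    else:
        raise ValueError(mode)
    xp = pad_left + list(x) + pad_right
    w = 2 * m + 1
    return [sum(xp[i:i + w]) / w for i in range(n)]

print(runmean([1, 1, 1, 4, 1, 1, 1], 1))   # [1.0, 1.0, 2.0, 2.0, 2.0, 1.0, 1.0]
```

Note how the choice of padding only changes the first and last M output values; the center of the result is identical in all three modes.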
X should not contain NaNs; a single NaN yields an all-NaN result. NaNs can be replaced beforehand using, e.g., "inpaint_nans" created by John D'Errico.
This is an incredibly fast implementation of a running mean, since execution time does not depend on the size of the window.
This is version 3.0 (sep 19, 2006). The previous posted code of version 1.3 is attached at the end.
The cumsum method of calculating moving averages can result in large errors. See my comment on MOVING_AVERAGE, which has the same problem.
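For readers wondering why the execution time does not depend on the window size: the trick is to take one cumulative sum up front, after which every window sum is a single subtraction. A minimal Python sketch of the technique (interior points only; the function name is my own, and, as the comment above warns, subtracting two large cumulative sums can lose floating-point precision on long, large-valued inputs):

```python
from itertools import accumulate

def runmean_cumsum(x, m):
    """Running mean via the cumsum trick: O(n) regardless of window size."""
    c = [0.0] + list(accumulate(x))   # c[k] = sum(x[:k])
    w = 2 * m + 1
    # Each output costs one subtraction and one division,
    # independent of the window size w.
    return [(c[i + w] - c[i]) / w for i in range(len(x) - w + 1)]

print(runmean_cumsum([1, 1, 1, 4, 1, 1, 1], 1))   # [1.0, 2.0, 2.0, 2.0, 1.0]
```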
Hi Jos, I have now tested your code and it's amazing. I posted moving_average.m here on the FEX; it is much slower, but treats the edges differently in a way that works for me, and it also works in 2D and with NaNs. I'll change my codes to use your faster cumsum approach. Best regards!
Nice idea, well implemented, fast. Treatment of edges is always a problem.
Would be great if somebody extended that for arbitrary dimensions ...
Hazem, thanks for your comments. Indeed, the help section can be improved. I apologize for the confusion.
I will soon put up a new version of runmean.
Jos, sorry to keep bothering you. Your help text confused me. You say: "Y = RUNMEAN(X,M) computes a running mean on vector X using a window of M datapoints." Only later do you mention a contradictory statement: "...and M an positive odd integer defining (half) the size of the window". That is what threw me off.
Your code is certainly more speedy, but still has the problem with the edges. I will look into the division errors in slidefilter more closely.
As for the unwanted phase shift, any averaging operation can cause this, although using 'filter' (as in movave) makes it more pronounced.
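To make the phase shift concrete: a causal moving average in the style of MATLAB's filter(ones(1,w)/w, 1, x) only looks backward, so its output lags a centered window by (w-1)/2 samples. A small Python sketch (my own illustration, not code from any of the packages discussed here):

```python
from itertools import accumulate

def causal_ma(x, w):
    """Causal moving average, like filter(ones(1,w)/w, 1, x) in MATLAB:
    each output averages the current sample and the w-1 samples before it."""
    c = [0.0] + list(accumulate(x))
    return [(c[i + 1] - c[max(0, i + 1 - w)]) / w for i in range(len(x))]

# An impulse at index 3 produces a response centered at index 4,
# i.e. a lag of (w-1)/2 = 1 sample for w = 3.
print(causal_ma([0, 0, 0, 1, 0, 0, 0], 3))
```

A centered window (as in runmean) avoids this lag, which is why the comparison plots show movave's curve shifted relative to the data.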
I'd like to see your program incorporate some improvements on the edges. That would make it the best on FileExchange as regards both speed and edge treatment.
Hazem, please use runmean(c,15) (half window size !) for a proper comparison.
As you see, <movave> introduces an unwanted phase lead. Also both <slidefilter> and <movave> are slower, less efficiently coded, and more prone to floating point division errors than <runmean> ...
Try the following to see the problem:
% create a vector:
a = 0:0.1:100;
% construct a noisy sinusoid:
c = rand(1, length(a)) .* sin(a);
% plot and compare the smoothed results (add movave/slidefilter curves as desired):
plot(a, c, a, runmean(c, 15))
I just did the test s = [1 1 1 4 1 1 1]; slidefilter(s,3) and I got the answer [1 1 2 2 2 1 1].
So what's the problem? In addition, the length of the input is the same as that of the output. I will send an updated version to FileExchange just in case the one they have has errors.
I don't think you quite grasp the trick with cumsum ...
Unfortunately your code introduces (floating point arithmetic) errors. Test it yourself with a simple array like A = [1 1 1 4 1 1 1] and a window size of three: runmean(A,1) correctly gives [. 1 2 2 2 1 .], where the edge values depend on the padding mode.
I will update the help soon.
When you do the following division
Y = (X(m+1:end)-X(1:end-m)) / m
you are not making a distinction between the edges and the center. You are dividing by m, which should be the (maximum) number of elements to average over, whereas it is better to treat that number as variable near the edges. Dividing by the full window size there is equivalent to adding filler zeros into the average. Sorry I was not clear; I had in mind the problems I faced when writing my own code, even though your approach is not exactly the same.
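The variable-divisor idea described in this comment can be sketched as follows: instead of always dividing by the full window size (which implicitly averages in zeros at the edges), divide by the number of samples actually inside the window. This Python function is my own illustration of that edge treatment, not code from slidefilter or runmean:

```python
def runmean_varwin(x, m):
    """Running mean that shrinks the window at the edges and divides by
    the actual number of samples, instead of padding with zeros."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - m), min(n, i + m + 1)   # clip window to the data
        out.append(sum(x[lo:hi]) / (hi - lo))       # divide by the true count
    return out

print(runmean_varwin([1, 1, 1, 4, 1, 1, 1], 1))   # [1.0, 1.0, 2.0, 2.0, 2.0, 1.0, 1.0]
```

For this example the result happens to match 'edge' padding; in general the two edge treatments differ whenever the first and last samples are not locally constant.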
You are right about the inputs not being duplicated. I thought that was what you were doing with repmat; another look at the code cleared it up for me.
But there is a more serious potential problem that I couldn't figure out. Even in the center region your results don't match mine, not even approximately, and they don't do as good a job of smoothing. I am not sure why that is. Both slidefilter.m and movave.m match each other closely, which suggests there is some calculation error in runmean, though I am not absolutely sure.
Another problem I found was in the condition of the window being larger than length(X). You have already redefined m = 2*m+1, which means the condition actually triggers when the window is about half the length of the input.
You also talk about the input being a "2D vector" while I think you mean a 1-D vector.
Hazem, I really don't see your points ...
1. Filtering always causes problems at the edges, and every solution is somewhat arbitrary.
2. The mean and repmat functions are only used when the window is larger than the input (I'll add a warning in an update).
3. It is unlikely that memory problems will occur, since the inputs are not duplicated.
4. I do not add filler zeros.
Did you even bother to look at the code?
Also, try to get rid of the mean function call to increase efficiency. -- Good luck.
This is an interesting treatment, similar in concept to my slidefilter, but with some drawbacks. Your treatment of the edges is really bad, and I think it is because you are adding nonexistent filler zeros into your average. Try treating the edges separately like I've done, or maybe in a unified manner if you can figure out how. Compare slidefilter's performance for a given data set and window with that of movave and runmean: you'll see that movave is bad on one edge only, while runmean is bad on both. Your clever use of repmat and built-in functions might cause memory issues if the input vector is too long. I opted to update the sum rather than remember a huge matrix, admittedly at the expense of speed.
Overall good approach but has potential for much improvement. Try to keep the advantage of speed if there is a way to do it. I would be interested in seeing your work. -Hazem
Major change (1.3 - 3.0): can now operate along a specific dimension, and offers several possible treatments of the edges.
Added a warning when the window size is too large.
Fixed an error in the "inspired by" section.