Discover MakerZone

MATLAB and Simulink resources for Arduino, LEGO, and Raspberry Pi

Learn more

Discover what MATLAB® can do for your career.

Opportunities for recent engineering grads.

Apply Today

Thread Subject:
how to write an effcient code for expanding stdev?

Subject: how to write an effcient code for expanding stdev?

From: andy Lu

Date: 11 Jan, 2010 05:58:02

Message: 1 of 18

say for a vector with n elements: a1, a2, ..., an. I want to compute the expanding stdev. and store in vector b, i.e. bm = stdev(a1, a2, ..., am).

It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.

any good suggestion is greatly appreciated.

Subject: how to write an effcient code for expanding stdev?

From: Rune Allnor

Date: 11 Jan, 2010 06:32:40

Message: 2 of 18

On 11 Jan, 06:58, "andy Lu" <luy...@gmail.com> wrote:
> say for a vector with n elements: a1, a2, ..., an.  I want to compute the expanding stdev.  and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
>
> It seem if I just use simple for loop, it will be very slow.

What makes you think that? What are your alternatives?

> It is also very difficult to vectorize.
>
> any good suggestion is greatly appreciated.

Learn basic programming.

Rune

Subject: how to write an effcient code for expanding stdev?

From: Jan Simon

Date: 11 Jan, 2010 09:28:03

Message: 3 of 18

Dear Rune!

> > any good suggestion is greatly appreciated.
> Learn basic programming.
> Rune

Sorry for the trivial question: Do you mean BASIC or basic?

In case of "BASIC": I do not think, that the computations are faster in BASIC, even if it is a version with a compiler. There have been some really fast BASIC compilers for the Amiga, but this is not longer supported.

In case of "basic": Exactly this is the aim of the poster! He wants to learn basic programming and asked for help.

@Andy:
>say for a vector with n elements: a1, a2, ..., an. I want to compute the expanding stdev. and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
>It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.

If you post your loop, I'm sure some readers have ideas for improvements. Surely you get the best help, if you ask a question.

Kind regards, Jan

Subject: how to write an effcient code for expanding stdev?

From: Jos (10584)

Date: 11 Jan, 2010 09:38:02

Message: 4 of 18

"andy Lu" <luyi36@gmail.com> wrote in message <hieela$8nu$1@fred.mathworks.com>...
> say for a vector with n elements: a1, a2, ..., an. I want to compute the expanding stdev. and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
>
> It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
>
> any good suggestion is greatly appreciated.


It is so easy to vectorize, as long as you know your math:

% example data
  m = 10 ;
  a = rand(1,m) ;

% engine
  N = 1:m ;
  s1 = cumsum(a) ;
  s2 = cumsum(a.^2) ;
  % sample standard deviation
  SD0 = sqrt((N .* s2 - s1.^2) ./ (N .* (N-1))) ;
  % normalized by N
  SD1 = sqrt(N .* s2 - s1.^2) ./ N ;

% check (be aware of round-off issues)
for r=1:m,
  disp([r std(a(1:r),0) SD0(r) std(a(1:r),1) SD1(r)]) ;
end

John D'Errico has also contributed something similar to the FEX:
http://www.mathworks.com/matlabcentral/fileexchange/9428

Subject: how to write an effcient code for expanding stdev?

From: Oleg Komarov

Date: 11 Jan, 2010 09:41:04

Message: 5 of 18

"Jan Simon"
> Dear Rune!
>
> > > any good suggestion is greatly appreciated.
> > Learn basic programming.
> > Rune
>
> Sorry for the trivial question: Do you mean BASIC or basic?
>
> In case of "BASIC": I do not think, that the computations are faster in BASIC, even if it is a version with a compiler. There have been some really fast BASIC compilers for the Amiga, but this is not longer supported.
>
> In case of "basic": Exactly this is the aim of the poster! He wants to learn basic programming and asked for help.
>
> @Andy:
> >say for a vector with n elements: a1, a2, ..., an. I want to compute the expanding stdev. and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
> >It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
>
> If you post your loop, I'm sure some readers have ideas for improvements. Surely you get the best help, if you ask a question.
>
> Kind regards, Jan
I think Rune intended BaSiC and not BASIC or basic, which aer the same or are they not?
Kidding apart, basic MATLAB programming gives the opportunity to vectorize most of the problems.

First of all be precise, what do you mean and expanding Stdev.
Second give us some details, the script you're using or better a simplified version with the an example of the inputs and the outputs you want to achieve.

Oleg

Subject: how to write an effcient code for expanding stdev?

From: Rune Allnor

Date: 11 Jan, 2010 11:36:16

Message: 6 of 18

On 11 Jan, 10:28, "Jan Simon" <matlab.THIS_Y...@nMINUSsimon.de> wrote:
> Dear Rune!
>
> > > any good suggestion is greatly appreciated.
> > Learn basic programming.
> > Rune
>
> Sorry for the trivial question: Do you mean BASIC or basic?

'Basic' as in 'trivial', 'elementary', 'introductory'...

Rune

Subject: how to write an effcient code for expanding stdev?

From: Jan Simon

Date: 11 Jan, 2010 12:44:03

Message: 7 of 18

Dear Rune!

> > Sorry for the trivial question: Do you mean BASIC or basic?
> 'Basic' as in 'trivial', 'elementary', 'introductory'...
> Rune

Then I can calm down again. It would have been too exciting, if a C-expert like you advice to use BASIC.

Anyway, I did not want to offend any BASIC users here! Especially I do love GOTO, because it is the only intelligent command - if we define intelligence by the ability to break out of a loop everywhere.

Jan

Subject: how to write an effcient code for expanding stdev?

From: andy Lu

Date: 11 Jan, 2010 15:09:04

Message: 8 of 18

"Jos (10584) " <#10584@fileexchange.com> wrote in message <hierhq$or0$1@fred.mathworks.com>...
> "andy Lu" <luyi36@gmail.com> wrote in message <hieela$8nu$1@fred.mathworks.com>...
> > say for a vector with n elements: a1, a2, ..., an. I want to compute the expanding stdev. and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
> >
> > It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
> >
> > any good suggestion is greatly appreciated.
>
>
> It is so easy to vectorize, as long as you know your math:
>
> % example data
> m = 10 ;
> a = rand(1,m) ;
>
> % engine
> N = 1:m ;
> s1 = cumsum(a) ;
> s2 = cumsum(a.^2) ;
> % sample standard deviation
> SD0 = sqrt((N .* s2 - s1.^2) ./ (N .* (N-1))) ;
> % normalized by N
> SD1 = sqrt(N .* s2 - s1.^2) ./ N ;
>
> % check (be aware of round-off issues)
> for r=1:m,
> disp([r std(a(1:r),0) SD0(r) std(a(1:r),1) SD1(r)]) ;
> end
>
> John D'Errico has also contributed something similar to the FEX:
> http://www.mathworks.com/matlabcentral/fileexchange/9428

Thanks a lot Jos.

Any thoughts on similar case of expanding max?

Subject: how to write an effcient code for expanding stdev?

From: Derek O'Connor

Date: 11 Jan, 2010 16:18:19

Message: 9 of 18

"andy Lu" <luyi36@gmail.com> wrote in message <hifeug$dcj$1@fred.mathworks.com>...
> "Jos (10584) " <#10584@fileexchange.com> wrote in message <hierhq$or0$1@fred.mathworks.com>...
> > "andy Lu" <luyi36@gmail.com> wrote in message <hieela$8nu$1@fred.mathworks.com>...
> > > say for a vector with n elements: a1, a2, ..., an. I want to compute the expanding stdev. and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
> > >
> > > It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
> > >
> > > any good suggestion is greatly appreciated.
> >
> >
> > It is so easy to vectorize, as long as you know your math:
> >
> > % example data
> > m = 10 ;
> > a = rand(1,m) ;
> >
> > % engine
> > N = 1:m ;
> > s1 = cumsum(a) ;
> > s2 = cumsum(a.^2) ;
> > % sample standard deviation
> > SD0 = sqrt((N .* s2 - s1.^2) ./ (N .* (N-1))) ;
> > % normalized by N
> > SD1 = sqrt(N .* s2 - s1.^2) ./ N ;
> >
> > % check (be aware of round-off issues)
> > for r=1:m,
> > disp([r std(a(1:r),0) SD0(r) std(a(1:r),1) SD1(r)]) ;
> > end
> >
> > John D'Errico has also contributed something similar to the FEX:
> > http://www.mathworks.com/matlabcentral/fileexchange/9428
>
> Thanks a lot Jos.
>
> Any thoughts on similar case of expanding max?

-------------------------------

This is a note of caution to those who wish to write their own
variance functions.

The definition of sample variance of x(1:n) is

S2 = sum[(x-xbar)^2]/(n-1), where xbar = sum(x)/n.

This requires a two pass algorithm: 1st pass to calculate xbar
and a 2nd pass to calculate sum((x-xbar))^2.

The definition can be re-arranged to get

S1 = [sum(x^2) - (sum(x))^2/n]/(n-1).

This requires just 1 pass because a single loop can be used to
calculate sum(x^2) and sum(x). This yields a more elegant and
"efficient" algorithm and was often seen in older statistics
textbooks.

Unfortunately the 1-pass algorithm can suffer from severe
cancellation in floating point arithmetic. This may cause S1 to be
negative and a crash when sdev = sqrt(S1) is attempted.

Many calculators have used the 1-pass algorithm and Microsoft's
Excel used it until it was corrected in a recent version. The
programmer 'Harry' in the Climategate Affair used the 1-pass
algorithm and got a large negative S1. Ironically, this was not
due to the 1-pass algorithm, but integer overflow and
wrap-round to negative. 'Harry' didn't understand 32-bit signed
integer arithmetic in Fortran. Matlab, of course, uses the
2-pass algorithm.

See N. Higham, "Accuracy and Stability of Numerical Algorithms",
2nd Ed., SIAM 2002, page 11, on computing the variance.

Here is a Matlab function that compares the 1 and 2 pass
algorithms:
---------------------------------------
function S = PlotS1vS2;

% Plotting the results of 1-pass and 2-pass algorithms
% for the variance of [M M+1 M+2], M = 2^p

% Derek O'Connor, Feb 2008. derekroconnor@eircom.net

pows2 = 1:60;
n = length(pows2);
S = zeros(n,3);
for p = pows2
    M = 2^p;
    x = [M M+1 M+2];
    s = x(1) + x(2) + x(3);
    S(p,1) = x(1)^2 + x(2)^2 + x(3)^2 - s^2/3; % One Pass
    S(p,2) = (x(1)-s/3)^2 + (x(2)-s/3)^2 + (x(3)-s/3)^2; % Two Pass
    S(p,3) = 2*var(x); % Matlab
end

figure1 = figure;

subplot1 = subplot(1,2,1,'Parent',figure1);
plot(S(1:50,1),'k.-');
ylim(subplot1,[0 4]);

xlabel('p, where x=[2^p 2^p+1 2^p+2];');
ylabel('S_1(x)');
subplot(1,2,2)
plot(S(1:60,2),'k.-');
xlabel('p, where x=[2^p 2^p+1 2^p+2];');
ylabel('S_2(x)');

------------------------------------

The most insidious thing about the 1-pass algorithm is this: it
is right most of the time, but it is always waiting in ambush
for the right(wrong) data to come along.

Derek O'Connor

Subject: how to write an effcient code for expanding stdev?

From: Rune Allnor

Date: 11 Jan, 2010 18:56:18

Message: 10 of 18

On 11 Jan, 13:44, "Jan Simon" <matlab.THIS_Y...@nMINUSsimon.de> wrote:
> Dear Rune!
>
> > > Sorry for the trivial question: Do you mean BASIC or basic?
> > 'Basic' as in 'trivial', 'elementary', 'introductory'...
> > Rune
>
> Then I can calm down again. It would have been too exciting, if a C-expert like you advice to use BASIC.

Two comments:

1) I haven't seen any references to the BASIC programming
   language younger than 20 years...
2) I am not a C language excpert, as anyone remotely familiar
   with C wouuld recognize in an instant.

> Anyway, I did not want to offend any BASIC users here! Especially I do love GOTO, because it is the only intelligent command - if we define intelligence by the ability to break out of a loop everywhere.

Considering the general level on your advice on programming
as presented here, I am not at all surprised.

Rune

Subject: how to write an effcient code for expanding stdev?

From: Jos (10584)

Date: 11 Jan, 2010 21:03:20

Message: 11 of 18

"andy Lu" <luyi36@gmail.com> wrote in message <hifeug$dcj$1@fred.mathworks.com>...
> "Jos (10584) " <#10584@fileexchange.com> wrote in message <hierhq$or0$1@fred.mathworks.com>...
> > "andy Lu" <luyi36@gmail.com> wrote in message <hieela$8nu$1@fred.mathworks.com>...
> > > say for a vector with n elements: a1, a2, ..., an. I want to compute the expanding stdev. and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
> > >
> > > It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
> > >
> > > any good suggestion is greatly appreciated.
> >
> >
> > It is so easy to vectorize, as long as you know your math:
> >
> > % example data
> > m = 10 ;
> > a = rand(1,m) ;
> >
> > % engine
> > N = 1:m ;
> > s1 = cumsum(a) ;
> > s2 = cumsum(a.^2) ;
> > % sample standard deviation
> > SD0 = sqrt((N .* s2 - s1.^2) ./ (N .* (N-1))) ;
> > % normalized by N
> > SD1 = sqrt(N .* s2 - s1.^2) ./ N ;
> >
> > % check (be aware of round-off issues)
> > for r=1:m,
> > disp([r std(a(1:r),0) SD0(r) std(a(1:r),1) SD1(r)]) ;
> > end
> >
> > John D'Errico has also contributed something similar to the FEX:
> > http://www.mathworks.com/matlabcentral/fileexchange/9428
>
> Thanks a lot Jos.
>
> Any thoughts on similar case of expanding max?

You'd might be interested in SLIDEFUN:
http://www.mathworks.com/matlabcentral/fileexchange/12550

or try a (possible too memory consuming) trick along these lines:

A = [2 5 3 1 6 3 7] ;
runmax = max(toeplitz(repmat(A(1),numel(A),1),A))
% = (as expected) 2 5 5 6 6 7

hth
Jos

Subject: how to write an effcient code for expanding stdev?

From: Matt Fig

Date: 11 Jan, 2010 21:56:04

Message: 12 of 18

A FOR loop is simple enough here, and for a vector with 1 million elements the time is less than a tenth of a second on my old machine.

% Data
A = round(rand(1,1000000)*1000000);

% Engine
mx = A(1);
runmax2 = zeros(size(A));
runmax2(1) = mx;

for ii = 2:length(A)
    if A(ii) > mx
        mx = A(ii);
    end
    runmax2(ii) = mx;
end

Subject: how to write an effcient code for expanding stdev?

From: andy Lu

Date: 11 Jan, 2010 22:30:20

Message: 13 of 18

"Matt Fig" <spamanon@yahoo.com> wrote in message <hig6pk$4ho$1@fred.mathworks.com>...
> A FOR loop is simple enough here, and for a vector with 1 million elements the time is less than a tenth of a second on my old machine.
>
> % Data
> A = round(rand(1,1000000)*1000000);
>
> % Engine
> mx = A(1);
> runmax2 = zeros(size(A));
> runmax2(1) = mx;
>
> for ii = 2:length(A)
> if A(ii) > mx
> mx = A(ii);
> end
> runmax2(ii) = mx;
> end


Thanks and this is indeed fast already. It seems max is a much faster function than other function (e.g. std) in matlab.

Just to benefit ppl with similar questions as mine, here is a simple summary:

efficent code of roll/expanding mean => efficient code of roll/expanding stdev (with small risk of cancellation error mentioned by Derek)
expanding mean is implemented by cumsum function already built-in.
rolling mean is implemented either by filter or cumsum trick.
expanding max is fast enough using simple loop as above. (how about rolling max??)
******************************************************************


Now, the final ultimate challenge I am stuck with:

******************************************************************
How to efficiently implement the rolling/expanding percentile (special case would be median)
******************************************************************

Thanks again for all the generous help! Derek, Jos, Matt, and others....

Subject: how to write an effcient code for expanding stdev?

From: Jan Simon

Date: 11 Jan, 2010 23:37:03

Message: 14 of 18

Dear Rune!

> > Then I can calm down again. It would have been too exciting, if a
> > C-expert like you advice to use BASIC.

> 1) I haven't seen any references to the BASIC programming
> language younger than 20 years...

Fresh:
  Stephens, Rod
  Visual Basic 2010 Programmer's Reference
  ISBN-13: 978-0-470-49983-2 - John Wiley & Sons

> > Especially I do love GOTO, because it is the only intelligent command - if we define intelligence by the ability to break out of a loop everywhere.
>
> Considering the general level on your advice on programming
> as presented here, I am not at all surprised.

Then you would also not be surprised about my admiration for the COME FROM command. And the infinite loop of the Sinclair ZX81 BASIC:
  FOR I=1 TO 4E4 ... NEXT
INT16 overflow -> run "for e ver" - how stylish! I'm astonished, why the questions about "find(0:0.1:1 == 0.3)" appear so often, but nobody asks for "for i=1:Inf".

Jan

Subject: how to write an effcient code for expanding stdev?

From: Greg Heath

Date: 12 Jan, 2010 08:38:30

Message: 15 of 18

On Jan 11, 12:58 am, "andy Lu" <luy...@gmail.com> wrote:
> say for a vector with n elements: a1, a2, ..., an.  I want to compute the expanding stdev.  and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
>
> It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
>
> any good suggestion is greatly appreciated.

Why not use the square root of the cumulative variance?

meanx(1) = x(1);
varx(1) = 0;
meanx(n) = meanx(n-1) + ( x(n)-meanx(n-1) )/n;
varx(n) = varx(n-1) + ( ( x(n) - meanx(n) )^2 - varx(n-1) ) /(n-1);

If the cumulative mean and variance don't have to be stored,
this can be rewritten using scalar newmean,oldmean,newvar and oldvar.

Hope this helps.

Greg

Subject: how to write an effcient code for expanding stdev?

From: Greg Heath

Date: 12 Jan, 2010 08:55:18

Message: 16 of 18

On Jan 11, 4:38 am, "Jos (10584) " <#10...@fileexchange.com> wrote:
> "andy Lu" <luy...@gmail.com> wrote in message <hieela$8n...@fred.mathworks.com>...
> > say for a vector with n elements: a1, a2, ..., an.  I want to compute the expanding stdev.  and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
>
> > It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
>
> > any good suggestion is greatly appreciated.
>
> It is so easy to vectorize, as long as you know your math:
>
> % example data
>   m = 10 ;
>   a = rand(1,m) ;
>
> % engine
>   N = 1:m ;
>   s1 = cumsum(a) ;
>   s2 = cumsum(a.^2) ;
>   % sample standard deviation
>   SD0 = sqrt((N .* s2 - s1.^2) ./ (N .* (N-1))) ;
>   % normalized by N
>   SD1 = sqrt(N .* s2 - s1.^2) ./ N ;
>
> % check (be aware of round-off issues)
> for r=1:m,
>   disp([r std(a(1:r),0) SD0(r) std(a(1:r),1) SD1(r)]) ;
> end
>
> John D'Errico has also contributed something similar to the FEX:http://www.mathworks.com/matlabcentral/fileexchange/9428

This assumes that all m measurements are available at once.
When only measurements a(1:r) are available when stdev(r)
is required, quite a different approach is needed: Typically,
mean(a) and variance(a) updating formulae are used.

Hope this helps.

Greg

Subject: how to write an effcient code for expanding stdev?

From: Greg Heath

Date: 12 Jan, 2010 09:11:11

Message: 17 of 18

On Jan 12, 3:38 am, Greg Heath <he...@alumni.brown.edu> wrote:
> On Jan 11, 12:58 am, "andy Lu" <luy...@gmail.com> wrote:
>
> > say for a vector with n elements: a1, a2, ..., an.  I want to compute the expanding stdev.  and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
>
> > It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
>
> > any good suggestion is greatly appreciated.
>
> Why not use the square root of the cumulative variance?
>
> meanx(1) = x(1);
> varx(1)    = 0;
> meanx(n) = meanx(n-1) + ( x(n)-meanx(n-1) )/n;
> varx(n)    = varx(n-1) + ( ( x(n) - meanx(n) )^2 - varx(n-1) ) /(n-1);
>
> If the cumulative mean and variance don't have to be stored,
> this can be rewritten using scalar newmean,oldmean,newvar and oldvar.

I made the assumption that only x(1:n) is available when stdevx(n) is
required.

Hope this helps.

Greg

Subject: how to write an effcient code for expanding stdev?

From: andy Lu

Date: 12 Jan, 2010 16:20:28

Message: 18 of 18

Greg Heath <heath@alumni.brown.edu> wrote in message <2f5fe558-30a2-4919-962a-a083205a6500@k17g2000yqh.googlegroups.com>...
> On Jan 12, 3:38 am, Greg Heath <he...@alumni.brown.edu> wrote:
> > On Jan 11, 12:58 am, "andy Lu" <luy...@gmail.com> wrote:
> >
> > > say for a vector with n elements: a1, a2, ..., an.  I want to compute the expanding stdev.  and store in vector b, i.e. bm = stdev(a1, a2, ..., am).
> >
> > > It seem if I just use simple for loop, it will be very slow. It is also very difficult to vectorize.
> >
> > > any good suggestion is greatly appreciated.
> >
> > Why not use the square root of the cumulative variance?
> >
> > meanx(1) = x(1);
> > varx(1)    = 0;
> > meanx(n) = meanx(n-1) + ( x(n)-meanx(n-1) )/n;
> > varx(n)    = varx(n-1) + ( ( x(n) - meanx(n) )^2 - varx(n-1) ) /(n-1);
> >
> > If the cumulative mean and variance don't have to be stored,
> > this can be rewritten using scalar newmean,oldmean,newvar and oldvar.
>
> I made the assumption that only x(1:n) is available when stdevx(n) is
> required.
>
> Hope this helps.
>
> Greg

Thanks Greg. This is what I assume.

I will try your approach and compare with vectorization.

Tags for this Thread

No tags are associated with this thread.

What are tags?

A tag is like a keyword or category label associated with each thread. Tags make it easier for you to find threads of interest.

Anyone can tag a thread. Tags are public and visible to everyone.

Contact us