# Three ways to do the same integration leading to three different speeds: why?

Theo on 8 Nov 2014
Edited: Theo on 16 Jun 2015
Here is code that applies the trapezoid rule to a function on [0, 100], using a random vector with 1000 entries. The integration is repeated 1 million times. On my computer, the first method takes 3 seconds, the second 9 seconds, and the third 11.5 seconds.
But the only difference between the third and the first is that the third method puts the calculation into a function, and the only difference between the second and the first is the vector notation. I don't understand why these lead to such different efficiencies. Moreover, I am worried that putting an identical script into a function multiplies the time by more than a factor of three!
Here is the setup:
x = linspace(0, 100, 1000);
dx = x(2) - x(1);
y = rand(size(x));
First method:
G = 0;
tic
for j = 1:1000000
g = 2*y;
g(1) = g(1)/2;
g(end) = g(end)/2;
G = G + dx/2*sum(g);
end
toc
disp(['RUN 1: G = ', num2str(G)]);
Second method:
G = 0;
tic
for j = 1:1000000
G = G + dx/2*sum([y(1) 2*y(2:end-1) y(end)]);
end
toc
disp(['RUN 2: G = ', num2str(G)]);
Third method:
G = 0;
tic
for j = 1:1000000
G = G + mytrapz_equal(dx, y);
end
toc
disp(['RUN 3: G = ', num2str(G)]);
The function has the same body as Run 1:
function F = mytrapz_equal(dx, y)
g = 2*y;
g(1) = g(1)/2;
g(end) = g(end)/2;
F = dx/2*sum(g);
end
Can anybody shed some light?

Mohammad Abouali on 8 Nov 2014
Calling a function has some overhead. You are adding that overhead 1,000,000 times, and that's why it takes longer.
The same goes for the second method: you are calling the sum function, so relative to the first implementation you are adding the function call overhead 1,000,000 times to the total time.
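One rough way to see the overhead described above is to time the same arithmetic written inline and behind a call. This is only a sketch under assumptions: `f` is a hypothetical anonymous-function stand-in for `mytrapz_equal` (chosen so everything fits in one script), `N` is smaller than in the question to keep the run short, and the numbers will vary by machine and MATLAB release.

```matlab
% Setup copied from the question.
x  = linspace(0, 100, 1000);
dx = x(2) - x(1);
y  = rand(size(x));
N  = 1e5;   % assumption: fewer iterations than the question, to keep the run short

% Hypothetical stand-in for mytrapz_equal. An anonymous function is used so the
% sketch runs as a single script; its call overhead may differ from that of a
% function stored in its own file.
f = @(dx, y) dx/2 * (2*sum(y) - y(1) - y(end));

% Loop A: the arithmetic written inline.
G_inline = 0;
tic
for j = 1:N
    G_inline = G_inline + dx/2 * (2*sum(y) - y(1) - y(end));
end
t_inline = toc;

% Loop B: the same arithmetic behind a function call.
G_call = 0;
tic
for j = 1:N
    G_call = G_call + f(dx, y);
end
t_call = toc;

% Rough per-call estimate; it is noisy and can even come out negative.
fprintf('inline: %.3f s, via call: %.3f s, extra per call: %.2e s\n', ...
        t_inline, t_call, (t_call - t_inline)/N);
```

Both loops accumulate the same value, so any gap between `t_inline` and `t_call` comes from the call itself rather than from the arithmetic.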

Theo on 8 Nov 2014
Mohammad: well, yes, I've learned an important lesson. Up until now, I've always preferred to reuse even very simple functions like this, thinking that I wouldn't be sacrificing any speed.
> What MATLAB version do you use?
I'm using R2013a. I'm really surprised at the differences between our computation times. Are you using 2014a/b?
Mohammad Abouali on 8 Nov 2014
Yes, I am using 2014b.
This is not confirmed, but I have noticed that some of my code runs faster on R2014b than on R2014a. I recall one specific code that was taking about 43 seconds on average and is now taking about 30 seconds. As I said, though, this is not confirmed and could be due to many other things. I never had time to really check this.
Theo on 10 Nov 2014
Thank you for the interesting discussion. I will try and grab a newer version of Matlab from my department.

### More Answers (2)

Roger Stafford on 8 Nov 2014
As to the difference between methods 1 and 2: on each of the million trips through the loop, method 2 has to construct and then abandon a new 1000-element array, namely [y(1) 2*y(2:end-1) y(end)], which it then hands to 'sum'. That takes allocation time that doesn't occur in method 1. That's just a guess on my part.
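If the concatenated array is indeed the cost, one way to sidestep it is to rearrange the sum algebraically: dx/2*(y(1) + 2*sum(y(2:end-1)) + y(end)) equals dx*(sum(y) - (y(1) + y(end))/2). The sketch below only checks that the two forms agree. The names `G_concat` and `G_direct` are illustrative, and whether the second form is actually faster on a given release would need to be timed.

```matlab
% Setup copied from the question.
x  = linspace(0, 100, 1000);
dx = x(2) - x(1);
y  = rand(size(x));

% Method 2 from the question: builds a new 1000-element row vector on each call.
G_concat = dx/2 * sum([y(1) 2*y(2:end-1) y(end)]);

% Algebraically equivalent form: sum y once, then subtract half of each endpoint.
% No concatenated array is built.
G_direct = dx * (sum(y) - (y(1) + y(end))/2);

% The two results agree up to floating-point rounding.
fprintf('difference = %g\n', abs(G_concat - G_direct));
```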
You should realize that accounting for timing with differing though equivalent code is fraught with uncertainty. First the Matlab language must be translated into C (I presume) and then on your computer the C is translated via a compiler into machine language appropriate for your computer, and strange and illogical things can happen to the timing in the process. A lot depends on the particular decisions the compiler writers made in coding this translation.
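Given that uncertainty, a single tic/toc reading can mislead. A minimal sketch of one common precaution is to repeat the measurement and look at the spread. The repeat count `nRep` and the iteration count `nIter` are arbitrary choices made for this example, not values from the thread.

```matlab
% Setup copied from the question.
x  = linspace(0, 100, 1000);
dx = x(2) - x(1);
y  = rand(size(x));

nRep  = 7;      % number of repeated measurements (arbitrary choice)
nIter = 1e4;    % loop iterations per measurement (arbitrary choice)
t = zeros(1, nRep);

for r = 1:nRep
    G = 0;
    tic
    for j = 1:nIter
        G = G + dx/2 * sum([y(1) 2*y(2:end-1) y(end)]);   % method 2 from the question
    end
    t(r) = toc;
end

% Report the spread rather than a single number.
fprintf('min = %.4f s, median = %.4f s, max = %.4f s\n', min(t), median(t), max(t));
```

If the minimum and maximum differ by a large fraction, differences of similar size between two methods should be read with caution.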

Theo on 16 Jun 2015
Edited: Theo on 16 Jun 2015
This is a later 'answer' added to complement the above answers about the overhead in concatenating arrays and modularizing via functions. There is a very nice and succinct article here about these very issues:
It can also be found in Chapter 10 of this GitHub repository:
https://github.com/paulgribble/SciComp