Why is code faster when broken across multiple lines?
Trying to speed up some code, so I used Wolfram Alpha to find an analytical form of what I was previously computing as a numerical integral. Curiously, the integral goes way faster when I break the code up. I have something of a reputation at work for knowing how to speed up other people's Matlab code, so I'd like to understand why i0 takes so much longer to compute than i (and l1-l5):
i=indefiniteIntegral(1/3,100*rand(1e6,2)); % <-- run this in the profiler
function i = indefiniteIntegral(a,x)
E1 = a*sqrt(x.^2+1);
E2 = 2*a-1;
E3 = sqrt(E2);
E4 = (a-1)*x;
l1 = -a^2*log(E1+E4+E3);
l2 = -a^2*log(E1-E4+E3);
l3 = a^2*log(-2*a*x+a*E3-E3+x);
l4 = a^2*log(2*a*x+a*E3-E3-x);
l4b = -2*a^2*atanh((E3*x)/(a-1));
l5 = 2*E3*E1+2*E3*a*x-2*E3*x;
i=.5*E2^(-3/2)*(l1+l2+l3+l4+l4b+l5);
i0=.5*E2^(-3/2)*(-a^2*log(E1+E4+E3)-a^2*log(E1-E4+E3)+a^2*log(-2*a*x+a*E3-E3+x)+a^2*log(2*a*x+a*E3-E3- x)-2*a^2*atanh((E3*x)/(a-1))+2*E3*E1+2*E3*a*x-2*E3*x);
% assertEqual(i,i0,1e-4); % My own function for checking equality within a tolerance, and i is equal to i0 in this case
% Reference: http://www.wolframalpha.com/input/?i=integrate+1%2F%281-a*%281-x%2F%28sqrt%281%2Bx%5E2%29%29%29
Is this an artifact of profiling or something? My typical usage, in case it matters, is that I'm given a bunch of values in x, and I run:
i = diff(abs(indefiniteIntegral(a,[-x,0])),[],2);
All the math seems to work, I mean I get the correct values compared to numerical integration, I just want to understand why the code runs faster when I split it up (which I did only for readability at first).
Accepted Answer
Matt J
on 5 Nov 2014
Edited: Matt J
on 5 Nov 2014
Like Sean, I imagine the JIT accounts for the difference in speed that you are seeing. In general, however, it usually is better to break up the computation, because it spares you the inefficiency of re-computing expressions that appear in the closed-form expression multiple times.
For example, your decomposition into l1...l5 could be made even more efficient, if you pre-computed vector quantities like 2*a*x and E3*x and re-used them instead of re-computing them in several places, as you are doing currently.
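A minimal sketch of that suggestion, assuming the same inputs as in the question. The names ax2, s and E3x are hypothetical and only hold the quantities that appear more than once; everything else follows the original l1...l5.

```matlab
% Sketch: compute shared terms once and reuse them.
% ax2, s and E3x are hypothetical names, not from the original post.
a  = 1/3;
x  = 100*rand(1e5,2);

E1 = a*sqrt(x.^2+1);
E2 = 2*a-1;
E3 = sqrt(E2);          % complex when a < 1/2
E4 = (a-1)*x;

ax2 = 2*a*x;            % vector term shared by l3 and l4
s   = a*E3 - E3;        % scalar offset shared by l3 and l4
E3x = E3*x;             % vector term shared by l4b and l5

l1  = -a^2*log(E1+E4+E3);
l2  = -a^2*log(E1-E4+E3);
l3  =  a^2*log(-ax2 + s + x);
l4  =  a^2*log( ax2 + s - x);
l4b = -2*a^2*atanh(E3x/(a-1));
l5  = 2*E3*E1 + 2*a*E3x - 2*E3x;
i   = .5*E2^(-3/2)*(l1+l2+l3+l4+l4b+l5);
```

Whether this is measurably faster depends on the MATLAB release and the size of x, so it is worth timing against the original before keeping it.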
1 Comment
Matt J
on 5 Nov 2014
Edited: Matt J
on 5 Nov 2014
Incidentally, i0 also does not factor the expression in an optimal way. Most of the (vector) terms are multiplied by the scalar a^2, so computing each term without it and applying a^2 and the other scalars only once at the end, as below, should reduce the computation somewhat:
l1 = log(E1+E4+E3);
l2 = log(E1-E4+E3);
l3 = log(-2*a*x+a*E3-E3+x);
l4 = log(2*a*x+a*E3-E3-x);
l4b = atanh((E3*x)/(a-1));
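c = .5*E2^(-3/2); % common scalar prefactor
l5 = 2*E3*(E1+E4); % non-log term from the original, equal to 2*E3*E1+2*E3*a*x-2*E3*x
i = (c*a^2)*(l3+l4-l1-l2) - (2*c*a^2)*l4b + c*l5;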
You could also avoid the multiple calls to log() by writing as the log of a product instead of a sum of logs.
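A rough sketch of the log-of-a-product idea, with one caveat: for a < 1/2 the quantity E3 is complex, and for complex (or negative) arguments log(z1*z2) can differ from log(z1)+log(z2) by an integer multiple of 2*pi*1i. The names n, d, sumOfLogs, oneLog and k are hypothetical, and the snippet only compares the two forms; it does not claim they are interchangeable in every case.

```matlab
% Sketch: one log() call instead of four, plus a branch check.
% n, d, sumOfLogs, oneLog and k are hypothetical names.
a  = 1/3;
x  = 100*rand(1e5,2);

E1 = a*sqrt(x.^2+1);
E2 = 2*a-1;
E3 = sqrt(E2);          % complex when a < 1/2
E4 = (a-1)*x;

n = (-2*a*x+a*E3-E3+x).*(2*a*x+a*E3-E3-x);   % arguments of l3 and l4
d = (E1+E4+E3).*(E1-E4+E3);                  % arguments of l1 and l2

sumOfLogs = log(-2*a*x+a*E3-E3+x) + log(2*a*x+a*E3-E3-x) ...
          - log(E1+E4+E3) - log(E1-E4+E3);
oneLog    = log(n./d);

% The real parts should agree; the imaginary parts may differ by 2*pi*k.
k = imag(sumOfLogs - oneLog)/(2*pi);
fprintf('largest branch offset |k|: %g\n', max(abs(round(k(:)))));
```

If k is nonzero for the inputs of interest, the single-log form needs a correction of 2*pi*1i*k before it can replace the sum.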
More Answers (1)
Sean de Wolski
on 5 Nov 2014
The JIT accelerator is sometimes able to better optimize code that has been split into pieces.
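One way to check whether the difference is real or partly an artifact of profiling is to time both forms outside the profiler. The sketch below uses tic/toc averaged over a few repetitions; nReps, tOneLine and tSplit are hypothetical names, and the numbers may differ between MATLAB releases and between script and function context.

```matlab
% Sketch: time the one-line and split forms without the profiler.
% nReps, tOneLine and tSplit are hypothetical names.
a  = 1/3;
x  = 100*rand(1e5,2);
E1 = a*sqrt(x.^2+1);
E2 = 2*a-1;
E3 = sqrt(E2);
E4 = (a-1)*x;
nReps = 5;

tic
for r = 1:nReps
    i0 = .5*E2^(-3/2)*(-a^2*log(E1+E4+E3)-a^2*log(E1-E4+E3)+a^2*log(-2*a*x+a*E3-E3+x)+a^2*log(2*a*x+a*E3-E3-x)-2*a^2*atanh((E3*x)/(a-1))+2*E3*E1+2*E3*a*x-2*E3*x);
end
tOneLine = toc/nReps;

tic
for r = 1:nReps
    l1  = -a^2*log(E1+E4+E3);
    l2  = -a^2*log(E1-E4+E3);
    l3  =  a^2*log(-2*a*x+a*E3-E3+x);
    l4  =  a^2*log(2*a*x+a*E3-E3-x);
    l4b = -2*a^2*atanh((E3*x)/(a-1));
    l5  = 2*E3*E1+2*E3*a*x-2*E3*x;
    i   = .5*E2^(-3/2)*(l1+l2+l3+l4+l4b+l5);
end
tSplit = toc/nReps;

fprintf('one line: %.4f s, split: %.4f s\n', tOneLine, tSplit);
```

If the gap shrinks or disappears here, the profiler's per-line overhead is a likely contributor; if it persists, the accelerator explanation above is the more plausible one.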