Thread Subject: Minor disappointment with Matlab

Subject: Minor disappointment with Matlab

From: Dan

Date: 4 Sep, 2009 22:00:57

Message: 1 of 11

The other day, because some of my simulation code was taking a long
time to run, I spent some time trying to optimize the algorithm. This
was wasted time, because when I finally got around to profiling the
code, it seems that most of the time was spent in the Matlab "cross"
function. I thought this was such a basic function that I did not look
there for my problem. When I replaced the built-in function with my
own, the simulation was sped up by about a factor of 30.

Here is some representative code to illustrate:

(execution times and percentages from "profile" on the right)

for k=1:100000
    a = rand(1,3);      % 0.265s   2.7%
    b = rand(1,3);      % 0.125s   1.3%
    c = cross(a,b);     % 9.340s  94.2%
    d = [ a(2)*b(3)-a(3)*b(2), a(3)*b(1)-a(1)*b(3), a(1)*b(2)-a(2)*b(1) ];  % 0.172s 1.7% (same as "cross")
end


I recommend that you carefully consider using "cross" in calculation
intensive code.

Regards,
Dan

Subject: Minor disappointment with Matlab

From: Bruno Luong

Date: 4 Sep, 2009 22:13:02

Message: 2 of 11

Dan <dnp037@yahoo.com> wrote in message <25f0b0e8-fa6b-4283-81eb-d3bfebeb4a97@c37g2000yqi.googlegroups.com>...

>
> I recommend that you carefully consider using "cross" in calculation
> intensive code.

Better still, vectorize the code. Most Matlab functions carry a lot of overhead because they are not designed for use within a for-loop.

You can check the source code of cross. It is not a *built-in* function, simply a stock mfile.
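
For readers who want to check this from the command line, a small sketch follows. It assumes a standard installation where cross ships as an M-file; the printed path and source differ between releases, so no output is shown.

```matlab
% Print the location of the file that implements cross.
which cross

% exist returns 2 when the name resolves to a file on the path
% and 5 when it is a true built-in function.
kind = exist('cross');

% Print the source in the command window (works only for M-files).
type cross
```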

Bruno

Subject: Minor disappointment with Matlab

From: James Tursa

Date: 4 Sep, 2009 22:23:02

Message: 3 of 11

Dan <dnp037@yahoo.com> wrote in message <25f0b0e8-fa6b-4283-81eb-d3bfebeb4a97@c37g2000yqi.googlegroups.com>...
> The other day, because some of my simulation code was taking a long
> time to run, I spent some time trying to optimize the algorithm. This
> was wasted time, because when I finally got around to profiling the
> code, it seems that most of the time was spent in the Matlab "cross"
> function. I thought this was such a basic function that I did not look
> there for my problem. When I replaced the built-in function with my
> own, the simulation was sped up by about a factor of 30.
>
> Here is some representative code to illustrate:
>
> (execution times and percentages from "profile" on the right)
>
> for k=1:100000
> a = rand(1,3);      % 0.265s   2.7%
> b = rand(1,3);      % 0.125s   1.3%
> c = cross(a,b);     % 9.340s  94.2%
> d = [ a(2)*b(3)-a(3)*b(2), a(3)*b(1)-a(1)*b(3), a(1)*b(2)-a(2)*b(1) ];  % 0.172s 1.7% (same as "cross")
> end
>
>
> I recommend that you carefully consider using "cross" in calculation
> intensive code.
>
> Regards,
> Dan

To be fair, the built-in cross is vectorized, does argument checking, and allows the dimension used for the cross to be specified. Nevertheless, your point is well taken ... the difference between inline code and the built-in function for the simple 1x3 case is striking (R2008a, WinXP):

>> tic;crosstest;toc % built-in cross
Elapsed time is 11.885679 seconds.
>> tic;crosstest;toc % inline cross
Elapsed time is 0.373162 seconds.
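
As an illustration of the dimension argument mentioned above, here is a minimal sketch. The row-per-vector layout and the variable names are assumptions made for this example only.

```matlab
n = 5;
A = rand(n,3);   % one 3-vector per ROW (layout assumed for this example)
B = rand(n,3);

% The third argument selects the dimension that holds the 3 components.
C = cross(A, B, 2);

% Hand-coded equivalent, for comparison.
D = [ A(:,2).*B(:,3) - A(:,3).*B(:,2), ...
      A(:,3).*B(:,1) - A(:,1).*B(:,3), ...
      A(:,1).*B(:,2) - A(:,2).*B(:,1) ];

maxDiff = max(abs(C(:) - D(:)));
```

Without the third argument, cross works along the first dimension of length 3, so passing the dimension explicitly matters mainly for inputs such as 3-by-3 arrays where the layout is ambiguous.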

James Tursa

Subject: Minor disappointment with Matlab

From: Bruno Luong

Date: 5 Sep, 2009 06:45:06

Message: 4 of 11

function t

n = 100000;
A = rand(3,n);
B = rand(3,n);

tic
for k=1:n
    a = A(:,k);
    b = B(:,k);
    c = cross(a,b);
end
toc % 7.201296 seconds.

tic
for k=1:n
    a = A(:,k);
    b = B(:,k);
    c = [ a(2)*b(3)-a(3)*b(2) a(3)*b(1)-a(1)*b(3) a(1)*b(2)-a(2)*b(1) ];
end
toc % 0.215036 seconds.

tic
C = cross(A,B);
for k=1:n
    c = C(:,k);
end
toc % 0.078556 seconds.

Subject: Minor disappointment with Matlab

From: Rune Allnor

Date: 5 Sep, 2009 10:35:42

Message: 5 of 11

On 5 Sep, 00:23, "James Tursa"
<aclassyguy_with_a_k_not_...@hotmail.com> wrote:
> Dan <dnp...@yahoo.com> wrote in message <25f0b0e8-fa6b-4283-81eb-d3bfebeb4...@c37g2000yqi.googlegroups.com>...
> > The other day, because some of my simulation code was taking a long
> > time to run, I spent some time trying to optimize the algorithm.
...
> To be fair, the built-in cross is vectorized, does argument checking, and allows the dimension used for the cross to be specified.

...all of which takes place at *run* time. A compiled
language would do a lot (all?) of that at *compile* time.

Rune

Subject: Minor disappointment with Matlab

From: Dan

Date: 8 Sep, 2009 14:44:45

Message: 6 of 11

On Sep 5, 1:45 am, "Bruno Luong" <b.lu...@fogale.findmycountry> wrote:
> function t
>
> n = 100000;
> A = rand(3,n);
> B = rand(3,n);
>
> tic
> for k=1:n
>     a = A(:,k);
>     b = B(:,k);
>     c = cross(a,b);
> end
> toc % 7.201296 seconds.
>
> tic
> for k=1:n
>     a = A(:,k);
>     b = B(:,k);
>     c = [ a(2)*b(3)-a(3)*b(2) a(3)*b(1)-a(1)*b(3) a(1)*b(2)-a(2)*b(1) ];
> end
> toc % 0.215036 seconds.
>
> tic
> C = cross(A,B);
> for k=1:n
>     c = C(:,k);
> end
> toc % 0.078556 seconds.

I appreciate your response.

However, you didn't check
C = [ A(2,:).*B(3,:) - A(3,:).*B(2,:);
      A(3,:).*B(1,:) - A(1,:).*B(3,:);
      A(1,:).*B(2,:) - A(2,:).*B(1,:) ];

which is always faster.

Experienced users of Matlab know about the benefits of vectorization.
Long-time users have had to weigh the vectorization trade-off: too
much vectorization could push the machine into virtual memory, and the
resulting operating system swapping actually increased run time. And
not every piece of code can be vectorized easily.

My point was that Matlab is a wonderful tool, because of the ease and
speed with which one can perform incredible amounts of calculations.
The set of functions available to the user is amazing and powerful.
When I use a "built-in* " function, I realize that it is most likely
not the most efficient way to implement my desired function, but I'll
accept the overhead because of the convenience.

My only surprise was the factor of 30 difference using "cross" even
for non-vectorized code. If it had been a factor of 2 or 3, no big
deal. I only used "cross" because I was lazy and did not want to type
out the expression. Now, maybe I have been naive for many years, and
the same is true for most matrix/vector functions such as "dot", or
"norm". Or maybe because the majority of my experiences have been with
vectorized code, I had not noticed this phenomenon before. In any
case, I shall be very careful when using Matlab functions when the
code is not vectorized.

Regards,
Dan

*Your clarification of built-in functions is a distinction without a
practical difference.

Subject: Minor disappointment with Matlab

From: Bruno Luong

Date: 8 Sep, 2009 15:07:02

Message: 7 of 11

Where do you get the factor 30??? Here is my test result (2009B/Vista):

function testcross

n = 1e6;
A = rand(3,n);
B = rand(3,n);

tic
C = cross(A,B);
toc % Elapsed time is 0.173915 seconds.

tic
C = [ A(2,:).*B(3,:) - A(3,:).*B(2,:);
     A(3,:).*B(1,:)-A(1,:).*B(3,:);
     A(1,:).*B(2,:)-A(2,:).*B(1,:) ];
toc % Elapsed time is 0.168631 seconds.

%%%%%

If you look at the code of CROSS (it is *not* a built-in; it is an M-file: type CROSS, then open it from the right mouse button menu), it does exactly the same calculation as your code plus a little overhead.

The factor 30 comes only when you use CROSS inside the for loop, which is a bad way of using it.

Bruno

Subject: Minor disappointment with Matlab

From: Bruno Luong

Date: 8 Sep, 2009 15:35:03

Message: 8 of 11

Sorry Dan, I misread your post. We agree that the factor of 30 appears when cross is put inside the loop. If you follow the newsgroup more closely, you will find many threads where we discuss "vectorization" for the very reason you ran into. It is critical and difficult (and, I must admit, requires a lot of practice) to design the code in Matlab so that the speed does not suffer.

Bruno

Subject: Minor disappointment with Matlab

From: James Tursa

Date: 8 Sep, 2009 18:25:05

Message: 9 of 11

"Bruno Luong" <b.luong@fogale.findmycountry> wrote in message <h85rum$p3t$1@fred.mathworks.com>...
>
> The factor 30 comes only when you use CROSS inside the for loop, which is a bad way of using it.

That statement is a bit too black-and-white for me. There can be several cases, such as simulations, where a cross product is needed in the formation of derivatives etc. that affect the very next iteration. It is impossible to vectorize the cross product calculations in this case over the entire simulation. That is precisely the case where a factor of 30 would not be tolerable and hand coding the calculations makes sense.
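
A minimal sketch of that situation, under loudly stated assumptions: the dynamics (explicit Euler steps of dm/dt = m x B), the step size, and all variable names are hypothetical and chosen only to show a loop where each step depends on the previous one.

```matlab
% Hypothetical example: explicit Euler steps of dm/dt = m x B.
B  = [0 0 1];          % fixed direction (assumed)
m  = [1 0 0];          % initial state (assumed)
m2 = m;                % same state, advanced with cross() for comparison
dt = 1e-3;
nSteps = 1000;

for k = 1:nSteps
    % Inline cross product: each step needs the result of the previous
    % one, so the loop over k cannot be vectorized.
    mxB = [ m(2)*B(3)-m(3)*B(2), m(3)*B(1)-m(1)*B(3), m(1)*B(2)-m(2)*B(1) ];
    m   = m + dt*mxB;

    % Reference update using the library function.
    m2  = m2 + dt*cross(m2, B);
end

maxDiff = max(abs(m - m2));
```

Both updates should give the same trajectory; only the per-call overhead differs.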

James Tursa

Subject: Minor disappointment with Matlab

From: James Tursa

Date: 8 Sep, 2009 18:37:04

Message: 10 of 11

Dan <dnp037@yahoo.com> wrote in message <37a033b3-1375-4593-b673-460ea2713401@x37g2000yqj.googlegroups.com>...
>
> I only used "cross" because I was lazy and did not want to type
> out the expression. Now, maybe I have been naive for many years, and
> the same is true for most matrix/vector functions such as "dot", or
> "norm". Or maybe because the majority of my experiences have been with
> vectorized code, I had not noticed this phenomenon before. In any
> case, I shall be very careful when using Matlab functions when the
> code is not vectorized.

I wouldn't call using the cross function lazy ... if a function is already available I would likely make the same assumption that you did, that it was reasonably coded and not worth my time to hand-code myself.

But dot, in particular, is a function I tend to avoid. In testing I have done, it uses a different algorithm (a simple loop) than a straight matrix multiply and is slower and somewhat less accurate. e.g.

>> a=rand(20000000,1)+rand(20000000,1)*i;
>> b=rand(20000000,1)+rand(20000000,1)*i;
>> format long
>> a'*b
ans =
     9.996446745265679e+006 -1.467960263183340e+003i
>> dot(a,b)
ans =
     9.996446745268498e+006 -1.467960262848279e+003i
>> tic;a'*b;toc
Elapsed time is 0.302599 seconds.
>> tic;dot(a,b);toc
Elapsed time is 0.774087 seconds.

Although it is not obvious from what I have posted, I ran similar calculations against a 100 decimal digit accurate calculation and the a'*b result was the more accurate of the two. Admittedly, the difference is in the trailing bits that your calculations shouldn't depend on, but it probably takes away one of the only reasons I would use dot in the first place since it is slower than a matrix multiply. I would probably only use dot in multi-dimensional cases and even then only if I couldn't easily reformulate it into a matrix multiply.
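
To make the equivalence concrete, here is a small sketch with assumed sizes and variable names. For column vectors, a'*b conjugates a and so computes the same quantity as dot(a,b). The column-wise version uses an element-wise product and sum, which is one possible alternative and not necessarily the matrix-multiply reformulation meant above.

```matlab
n = 1000;
a = rand(n,1) + 1i*rand(n,1);
b = rand(n,1) + 1i*rand(n,1);

% For column vectors, a'*b conjugates a, which matches dot(a,b).
d1 = a'*b;
d2 = dot(a,b);
relErr = abs(d1 - d2) / abs(d1);

% Column-wise dot products of two matrices, without calling dot.
A = rand(n,4) + 1i*rand(n,4);
B = rand(n,4) + 1i*rand(n,4);
colDots = sum(conj(A).*B, 1);            % 1-by-4 row of results
colErr  = max(abs(colDots - dot(A,B)));
```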

norm I have found to be pretty fast.

James Tursa

Subject: Minor disappointment with Matlab

From: James Tursa

Date: 8 Sep, 2009 18:52:02

Message: 11 of 11

"James Tursa" <aclassyguy_with_a_k_not_a_c@hotmail.com> wrote in message <h8688g$bm8$1@fred.mathworks.com>...
> >> tic;a'*b;toc
> Elapsed time is 0.302599 seconds.
> >> tic;dot(a,b);toc
> Elapsed time is 0.774087 seconds.

P.S. I should mention that the main reason for this speed difference seems to be that the BLAS calls behind the matrix multiply are multi-threaded (at least in later versions of MATLAB) whereas the simple loop used in the dot function is not.
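
One way to probe this is to limit MATLAB to a single computational thread and repeat the timing. The sketch below assumes maxNumCompThreads is available in the release being used; the vector length is arbitrary, and timings vary by machine, so none are shown.

```matlab
n = 2e6;
a = rand(n,1) + 1i*rand(n,1);
b = rand(n,1) + 1i*rand(n,1);

nPrev = maxNumCompThreads(1);   % limit to one thread; returns the old setting
tic; r1 = a'*b;     tMulSingle = toc;
tic; r2 = dot(a,b); tDotSingle = toc;

maxNumCompThreads(nPrev);       % restore the previous setting
tic; r3 = a'*b;     tMulMulti = toc;

fprintf('a''*b: %.3fs (1 thread), %.3fs (%d threads); dot: %.3fs\n', ...
        tMulSingle, tMulMulti, nPrev, tDotSingle);
```

If the single-threaded a'*b time moves close to the dot time, that would be consistent with the explanation above, though it does not prove it.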

James Tursa
