From: "Nike Dattani" <dattani.nike@gmail.com>
Newsgroups: comp.soft-sys.matlab
Subject: Re: New challenge! Minimizing the amount of 'unnecessary memory' !
Date: Thu, 24 Mar 2011 01:50:19 +0000 (UTC)
Organization: Oxford University

Hello!
Thanks once again for your very useful MEX routines!

> Array slicing in Fortran simply means that the compiler will write the for loops (or do loops) in the background for you, and will most likely do it in a manner that is efficient for memory access. But it doesn't necessarily mean that it will run any faster than if you write the for loops (or do loops) manually.
> 
> James Tursa

Thanks very much for your insight!
If that's true, I should probably have just done the whole thing with for loops (like the one Roger Stafford suggested in his second post) right from the beginning. The reason I was working with arrays (which of course take up much more memory than Roger Stafford's for loop) was that I always thought that:

A=A.*B 

is much faster than:

for i = 1:4^13
    A(i) = A(i)*B(i);
end

since the internal function for point-wise multiplication of arrays would hopefully have been highly optimized for memory management, multi-threading, multiplying several elements in parallel rather than one by one, etc.
In principle a compiler could try to optimize the for loop similarly, but since every for loop is slightly different, I would be surprised if array-wise multiplication weren't faster.

Has anyone done a benchmark study comparing these two approaches?
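
One rough way to check this on a given machine is to time both versions on the same data. The following is only a minimal sketch, not a proper benchmark: the array length is reduced from 4^13 to 4^10 to keep the run short, and the variable names (A0, Avec, tVec, tLoop) are made up for illustration. The result will likely depend on the MATLAB version, the JIT accelerator, and whether the code runs in a script or in a function.

```matlab
% Rough timing sketch: vectorized product vs. explicit for loop.
% Assumption: n is reduced from 4^13 to 4^10 to keep the run short.
n  = 4^10;
A0 = rand(n, 1);
B  = rand(n, 1);

% Vectorized element-wise product
A = A0;
tic;
A = A .* B;
tVec = toc;
Avec = A;              % keep the result for comparison

% Explicit for loop over the same data
A = A0;
tic;
for i = 1:n
    A(i) = A(i) * B(i);
end
tLoop = toc;

% Both approaches should produce the same values
assert(isequal(Avec, A));

fprintf('vectorized: %.4f s, loop: %.4f s\n', tVec, tLoop);
```

A single tic/toc measurement is noisy, so repeating each timing several times and taking the median would give a fairer comparison.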