arrayfun vs loops again

David Young on 5 Apr 2024 at 22:41
Commented: Dyuman Joshi on 21 Apr 2024 at 10:19
I realise that the speed of arrayfun versus for-loops has been discussed before, but I remain puzzled by why the difference is as huge as it is in certain circumstances, and I wonder whether this is seen as an issue by the community or by MathWorks.
First an example - my questions are at the end. Define a simple class
classdef testclass
    properties
        p
    end
    methods
        function tc = testclass
            tc.p = pi;
        end
    end
end
a function that extracts a property value from a vector of objects of the class using a loop
function pvec = loopTest(tcvec)
    pvec = zeros(size(tcvec));   % preallocate the output
    for i = 1:length(tcvec)
        pvec(i) = tcvec(i).p;
    end
end
and one that does the same using arrayfun.
function pvec = arrayfunTest(tcvec)
    pvec = arrayfun(@(t) t.p, tcvec);
end
and then run some timing code:
tc = testclass;
tcvec = repmat(tc, 1, 1000);
tloop = timeit(@() loopTest(tcvec));
tarrayfun = timeit(@() arrayfunTest(tcvec));
fprintf("arrayfun takes %f times as long as a loop\n", tarrayfun/tloop);
which on my Windows PC running R2024a, prints (typically)
arrayfun takes 52.565968 times as long as a loop
A factor of 50! This effect accounts for some of the remarkably long run times that were plaguing my app.
One standard explanation of the difference is that arrayfun has to do an extra function call, so let's see if that's what is causing this, by putting an extra function call into the loop:
function pvec = loopFunTest(tcvec)
    pvec = zeros(size(tcvec));   % preallocate the output
    f = @(t) t.p;
    for i = 1:length(tcvec)
        pvec(i) = f(tcvec(i));   % same access, via a function handle
    end
end
and repeating the timing exercise:
tc = testclass;
tcvec = repmat(tc, 1, 1000);
tloopfun = timeit(@() loopFunTest(tcvec));
tarrayfun = timeit(@() arrayfunTest(tcvec));
fprintf("arrayfun takes %f times as long as a loop with a function\n", tarrayfun/tloopfun);
which now prints typically
arrayfun takes 6.163437 times as long as a loop with a function
So yes, the extra function call does have an effect. But there's still a factor of 6 between the loop and arrayfun - that's still huge, and enough to make arrayfun more or less useless. This is a pity - more than a pity, a serious nuisance - as I like to use it, and my code is littered with examples as it was a natural go-to before I discovered this problem.
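One way to separate the cost of calling through a function handle from the cost of indexing into an object array is to time the same three variants on a plain double vector. This is only a sketch: the helper names plainLoop and handleLoop and the trivial function f are illustrative and not part of the original post, and the ratios will depend on machine and release.

```matlab
% Sketch: time a plain loop, a loop calling a handle, and arrayfun on doubles.
x = rand(1, 1000);
f = @(v) v + 1;                      % trivial per-element function (assumption)

tLoop     = timeit(@() plainLoop(x));
tLoopFun  = timeit(@() handleLoop(f, x));
tArrayfun = timeit(@() arrayfun(f, x));

fprintf("loop with handle / plain loop: %.1f\n", tLoopFun / tLoop);
fprintf("arrayfun / loop with handle:   %.1f\n", tArrayfun / tLoopFun);

function y = plainLoop(x)
    y = zeros(size(x));              % preallocate the output
    for i = 1:numel(x)
        y(i) = x(i) + 1;             % operation written inline
    end
end

function y = handleLoop(f, x)
    y = zeros(size(x));              % preallocate the output
    for i = 1:numel(x)
        y(i) = f(x(i));              % same operation via a function handle
    end
end
```

If the arrayfun ratio stays large here too, the object array is not the main factor.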
I've tried to make my examples clear enough to illustrate the core of the problem. My two questions are these:
  1. Out of curiosity, how do you make arrayfun run 6 times more slowly than a loop? I realise, of course, that ultimately it is implemented as a loop, but that only means we might not expect it to be a lot faster. But even the most naive implementation should allow it to be about as fast as the loop - so what is going on to produce the spectacular slowdown?
  2. What are the factors that interact with arrayfun to make it slow? Without doing masses of tests, how can I know which parts of my code need to be rewritten as loops? A sense of when it's slow and when it isn't would be really useful.
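On the second question, one possible starting point short of timing every call site is the MATLAB profiler, which reports total time and call counts per function. The sketch below assumes a stand-in function workload in place of real application code (a hypothetical name); how arrayfun itself shows up in the report may vary, but cheap anonymous functions that are called many times should stand out by their call counts.

```matlab
% Sketch: profile a workload and list the functions with the largest total time.
profile on
result = workload();                 % hypothetical stand-in for application code
profile off

stats = profile('info');
ft = stats.FunctionTable;            % one entry per profiled function
[~, order] = sort([ft.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %8.4f s  %d calls\n', ...
        ft(k).FunctionName, ft(k).TotalTime, ft(k).NumCalls);
end

function s = workload()
    x = rand(1, 1e4);
    s = sum(arrayfun(@(v) v^2, x));  % a cheap function called many times
end
```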
  6 Comments
David Young on 8 Apr 2024 at 21:53
Thank you @Steven Lord. I hadn't thought about diagnostics, and yes, the arrayfun error message is much more helpful than what my own code provides. (In fact, what my implementation says in your example is just "Unable to perform assignment because the left and right sides have a different number of elements." I'd have to put in some work to provide the detail, and there would be some effect on performance.)
However, it's a trade-off, and I'd rather have the faster execution than the more informative message in this case - but I agree it's by no means obvious that will always be the right choice.
As you also point out, the amount of time taken by the function affects the significance of the loop overhead. My very minimal function perhaps unfairly emphasised the cost of the overhead, but I address this to some extent below.
Having talked about a simple loop-based reimplementation, I thought I ought to put my money where my mouth is, so my version of arrayfun, called applyFunction, is attached. It doesn't have all the functionality of arrayfun (for example it doesn't provide for more than one result) though I think most of it could be added without significant overhead.
This makes it easier to do speed comparisons. Also attached is a modified version of my test class which has a method that provides a more substantial computational load for each function call, so it's a little more realistic as I'm not timing pure overhead. Here's an example, with a bit more computation going on than before:
tc = testclass;
tcvec = repmat(tc, 1, 1000);
nrep = 1000; % controls amount of work each function call does
t1 = timeit(@() arrayfun(@(t) t.slowdown(nrep), tcvec));
t2 = timeit(@() applyFunction(@(t) t.slowdown(nrep), tcvec));
fprintf("arrayfun takes %f times as long as applyFunction\n", t1/t2);
which for me typically gives a factor of 3 in speed, for example:
arrayfun takes 2.958038 times as long as applyFunction
As you'd expect, the time ratio falls towards 1 if nrep is increased. On the other hand it climbs to 8 or 9 as it's decreased. But that case matters: I hit the problem precisely because my application was doing a simple thing to a large array of objects.
Bottom line: I still think it's worrying that my simple applyFunction (entirely in user-level MATLAB code) can go so much faster than MATLAB's arrayfun when iterating over an object array.
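The attachment is not reproduced in this thread, so for readers without it, here is a minimal sketch of what a loop-based stand-in for arrayfun might look like. It is not the attached applyFunction: the name applyFunctionSketch is hypothetical, and it assumes the function returns exactly one scalar per element, all of the same class.

```matlab
function out = applyFunctionSketch(f, x)
% Hypothetical loop-based stand-in for arrayfun (not the attached applyFunction).
% Assumes f returns exactly one scalar per element of x, all of the same class.
    n = numel(x);
    if n == 0
        out = [];                          % nothing to apply f to
        return
    end
    first = f(x(1));                       % first result fixes the output class
    out = repmat(first, size(x));          % preallocate with the same shape as x
    for i = 2:n
        out(i) = f(x(i));                  % errors if f returns a non-scalar
    end
end
```

For example, applyFunctionSketch(@(t) t.p, tcvec) would play the role of arrayfunTest(tcvec) above.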
Dyuman Joshi on 16 Apr 2024 at 15:07
"simply because that implementation would be just to execute the exact same loop inside arrayfun."
"The overhead factors you mention surely apply equally to the loop and arrayfun, "
That is not what is being done in the for loop though.
@Voss has provided an example of running a loop with a function handle and calling it in every iteration, and as you can see it is slower than the simple for loop.
"... , which you won't notice."
That is not true. There is always some overhead to calling a function, which as I mentioned depends on the function being called.


Answers (1)

Joss Knight on 14 Apr 2024 at 20:14

I wish it were a cleverer answer, but I'm afraid that it's simply that MATLAB has been heavily optimized for for loops over the years but the same optimizations have not been applied to arrayfun. In this case by far the most important optimization will be multithreading, so perhaps you have 32 virtual cores to work with.

It's tempting to just convert the arrayfun implementation to a for loop internally but, as others have implied, the devil is in the detail, since it would take some effort to get identical, backwards-compatible behaviour.

Best just to think of arrayfun as syntactic sugar, to be used only in situations that are not performance-critical. Or on the GPU, where it has a special implementation.
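For the GPU case, a minimal sketch might look like the following. It assumes Parallel Computing Toolbox and a supported GPU are available and falls back to the CPU otherwise; the function f is purely illustrative and must be a scalar, element-wise operation.

```matlab
% Sketch: element-wise arrayfun on the GPU (requires Parallel Computing Toolbox).
x = rand(1, 1e5);
f = @(v) v.^2 + 1;                          % scalar, element-wise (illustrative)
if canUseGPU
    y = gather(arrayfun(f, gpuArray(x)));   % apply f on the GPU, copy result back
else
    y = arrayfun(f, x);                     % CPU fallback
end
```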

  3 Comments
Joss Knight on 16 Apr 2024 at 15:38
There are many. For instance, I don't know, when constructing a singleton object you want to iterate over its arguments in a neat concise way that makes your code easier to read like
arrayfun(@mustBeTextScalar, varargin(1:2:end));
arrayfun(@mustBeNumeric, varargin(2:2:end));
Sometimes inefficiency is genuinely less important than readability. But admittedly, you need to be cautious.
Dyuman Joshi on 21 Apr 2024 at 10:19
"Sometimes inefficiency is genuinely less important than readability."
I guess I'll understand this when I face it myself ¯\_(ツ)_/¯


Release

R2024a