12 views (last 30 days)

I find it strange that no FEX contributors have tried to improve upon ACCUMARRAY, since I think it's widely agreed that it is slow.

I've speculated that the reason ACCUMARRAY is so slow (despite being a built-in) is that it has to support so many different options, and was considering trying to make a customized mex version, one which supported only summation accumulation, for example.

But before I pursue that, I was wondering if there's an apriori obvious reason why it would or would not make a difference.

Sean de Wolski
on 13 Nov 2012

Edited: Sean de Wolski
on 13 Nov 2012

How fast do you want it?

val = rand(3000,1);

idx = ceil(rand(3000,3)*10);

t = 0;

for ii = 1:100;

tic;

V = accumarray(idx,val,[10 10 10],@prod);

t=t+toc;

end

t./100

On my laptop:

ans = 0.0022

That's pretty good for accumulating 3000 values in 3d space. Bumping it up to 10000 rows it only take 0.0027 seconds...

~An accumarray fan

Mark Whirdy
on 13 Nov 2012

Edited: Mark Whirdy
on 13 Nov 2012

Hi Matt

I've never used accumarray for this reason (its very generalised so adds much overhead), and haven't yet come across a data-manipulation problem that couldn't be solved more efficiently without it (using matrix/"linear" indexing and the very handy ismember.m function [which has the mex "imembc" as its engine] ).

Can you give us an example of your specific use-case as my feeling is that pursuing a mex of accumarray will be a bit of a wild goose chase.

Best, Mark

Oliver Woodford
on 20 Nov 2013

Walter Roberson
on 14 Nov 2012

I worry about the cost of the memory allocations within accumarray(). There are some operations such as the default @sum or @max that can be done as a running total, but operations such as @median have to build up each entry as a list of values, each list of unknown size. The same is true for ugly but useful constructs such as @(x) {x} that return the (in-order) list of actual values.

Does accumarray do some kind of estimation of buffer sizes and pre-allocates each, growing (with some strategy?) at need and shrink-fitting later if need be? Or does accumarray do a prepass equivalent to

accumarray(index, 1)

to count the entries that will end up at each location, use that to allocate memory, and then do the real accumarray ? In some cases, pre-allocation can be done just as

t = max(index); if any(size(t,2:end) > 1); t = [t 1]; end

array = zeros(t, class(TheData));

but accumarray doesn't know in advance that it can use that, not unless the function was defaulted, or perhaps if accumarray somehow compares the function handles to @sum, @min, @max, and other functions that can be invoked iteratively to give the scalar result.

Thus it seems plausible to me that there is room for optimization of accumarray() around issue of memory allocation. For example there could be a flag saying to invoke the function iterative (which would need also an initial value.) Or perhaps the user could pass in a buffer size which was the maximum number of elements to expect per location. Or a re-buffer size (e.g., when the end of the buffer is reached, add N more elements to the buffer.)

Jan
on 14 Nov 2012

The built-in command cellfun contains some procedure, which avoid to call Matlab to process a function handle. These procedure can be triggered by using strings as 1st input as explained in the documentation. These procedures are implemented in the C-Mex file of cellfun, whose source code has been shipped with older versions of Matlab.

Unfortunately these very efficient methods are marked as beeing supported for backward compatibility only now and most of the users in this forum post much slower methods using anonymous functions. I do not stop to advertise the efficient methods here. But I'm not motivated to implement an improved accumarray as Mex file, when this is not really wanted and supported for the already existing cellfun.

Oleg Komarov
on 14 Nov 2012

@Matt: I think Simon addressed "I find it strange that no FEX contributors have tried to improve upon ACCUMARRAY".

My interpretation is: by analogy, optimized calls to cellfun with first input being string would have been discarded if not for backward compatibility, thus what is the benefit of exerting effort for improving on accumarray when the general trend is to disregard speed (for clarity).

In simple words, why would I (FEX contributor) spend effort in accumarray which is already seldom used? It's up to TMW to promote certain functions, and in general certain approaches.

Jan
on 14 Nov 2012

Opportunities for recent engineering grads.

Apply TodayFind the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!
## 4 Comments

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/53613-speed-optimization-of-accumarray-via-mex#comment_111036

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/53613-speed-optimization-of-accumarray-via-mex#comment_111036

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/53613-speed-optimization-of-accumarray-via-mex#comment_111050

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/53613-speed-optimization-of-accumarray-via-mex#comment_111050

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/53613-speed-optimization-of-accumarray-via-mex#comment_111150

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/53613-speed-optimization-of-accumarray-via-mex#comment_111150

## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/53613-speed-optimization-of-accumarray-via-mex#comment_111159

⋮## Direct link to this comment

https://www.mathworks.com/matlabcentral/answers/53613-speed-optimization-of-accumarray-via-mex#comment_111159

Sign in to comment.