Reading out vector by index matrix too slow?

Hey folks, I'm trying to implement a 3D acoustic source localization algorithm in real-time. Unfortunately, I have to handle a lot of data, i.e., big matrices. To cope with this and to avoid running out of memory, I generally work with index matrices or index vectors. Right now, one particular indexing operation is the weak point of my implementation in terms of processing time. The problem is as follows:
for k = 1:270
    % ... some code for block processing (processing time negligible)
    for j = 1:5
        for i = 1:70
            tmp = sum( DataArray( idxCell{i,j}{1} ), 2 );             % (1)
            ErgArray(:,i) = ErgArray(:,i) + tmp( idxCell{i,j}{2} );   % (2)
        end
    end
end
I've pre-allocated memory for 'ErgArray'. I have to do the computations with a lot of data (the data layout is already optimized):
size(tmp)             -> 142 x 1 (single)
size(idxCell{i,j}{1}) -> 142 x 5 (uint16)
size(idxCell{i,j}{2}) -> 51120 x 1 (uint16)
size(ErgArray)        -> 51120 x 70 (single)
If I eliminate the second for-loop by vectorization, the whole computation takes much longer. According to the profiler and tic/toc, line (1) requires 4.8 s and line (2) 19.6 s. I work in single-core mode, i.e., MATLAB is started with -singleCompThread. The whole process should take at most about 3 seconds. Using mex-files doesn't reduce the processing time.
By the way: idxCell{i,j}{1} contains specific indices. Using this cell avoids using complicated filters, i.e., very (!) big matrices and for-loops.
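For reference, here is a synthetic setup with random data of the stated sizes, so the timing can be reproduced (the real index contents and the size of DataArray are application-specific, so these are only placeholders):
% Placeholder data matching the shapes above; contents are random.
nData     = 5000;                          % assumed number of elements in DataArray (not stated above)
DataArray = single(rand(nData, 1));
ErgArray  = zeros(51120, 70, 'single');
idxCell   = cell(70, 5);
for j = 1:5
    for i = 1:70
        idxCell{i,j} = {uint16(randi(nData, 142, 5)), ...   % 142 x 5 indices into DataArray
                        uint16(randi(142, 51120, 1))};      % 51120 x 1 indices into the 142 x 1 sums
    end
end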
Have you ever encountered such a problem? Why does vectorization take more time than using an additional for-loop? Is there a more efficient way to do this computation? If so, which one? Is this way of filtering inefficient?
PS: Computer info: AMD FX(tm)-8150 Eight-Core Processor, 1.4 GHz, 2048 kB cache, 15 GB RAM.

Accepted Answer

Jan on 4 Apr 2013
In general, vectorized methods require more time when they have to create large intermediate arrays, because allocating memory can consume a lot of time.
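To illustrate what I mean by an intermediate array (a generic sketch, not your actual data):
x   = rand(1, 1e5, 'single');
idx = randi(1e5, 1e6, 1);
s1  = sum(x(idx));            % vectorized: creates the 1e6-element temporary x(idx) before summing
s2  = 0;
for n = 1:numel(idx)          % loop: no temporary array, but interpreter overhead per element
    s2 = s2 + x(idx(n));
end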
What exactly does "big" mean? Some users of this forum are already impressed by 100 elements; others think that everything which fits neatly into 192 GB of RAM is not "big".
Nested cells are usually not efficient. What does idxCell{i,j}{2} contain? If these are index vectors, logical indexing could be more efficient.
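A generic sketch of the difference (logical indexing can only replace an index vector when no repetitions are needed):
v    = rand(142, 1, 'single');
idx  = find(v > 0.5);     % numeric index vector
s1   = sum(v(idx));
mask = v > 0.5;           % equivalent logical mask
s2   = sum(v(mask));      % same result without building an index vector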
You are calculating sum(DataArray(idxCell{i,j}{1}), 2) again in each of the 270 iterations over k. Wouldn't it be faster to avoid such repeated computations?
  4 Comments
Hannes Pessentheiner on 4 Apr 2013
And I forgot to mention that I have to compute the sum in the innermost loop, because k represents the frame index and I have to calculate the sum for each time frame (so I cannot pull it out of the loops).
Jan on 4 Apr 2013 (edited 4 Apr 2013)
Yes, logical indexing works only if no repetitions are required.
Instead of idxCell{:,:}{2} the unnested idxCell{:,:,2} could be faster, but I do not expect this to be significant here.
If the index vector contains repetitions and you want to calculate the sum only, a dot product might be faster:
x = 1:5;
idx = [1,1,2,2,3,4,4,5];       % index vector with repetitions
r1 = sum(x(idx))               % builds the intermediate x(idx) before summing
r2 = x * [2; 2; 1; 2; 1];      % dot product with the repetition counts of idx; summed implicitly
This avoids creating larger intermediate arrays which contain redundant information. Here the vector for the dot product is the "run length" of idx - I mention this because I'm going to publish a fast runlength MEX file on the FEX this week...
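For completeness, such a weight vector can be computed with plain accumarray (just a sketch; this is not the runlength MEX mentioned above):
w  = accumarray(idx(:), 1);    % occurrence count of each index: [2; 2; 1; 2; 1]
r3 = x * w;                    % same value as r1 and r2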
