Bad performance when setting first column element of a matrix

Asked by Jan on 11 Oct 2011

I have a loop in which I read a lot of data and store it in a matrix, like this:

l = numel(c_d); % c_d is data to be stored
typeData.data{j}(row, 1:l) = c_d; % row and j have been determined earlier

The performance of this seemed bad, so I tried pre-allocating the matrix stored in typeData.data{j}. This did not seem to matter, so I tried the following (profiling results are in the comments):

l = numel(c_d);           % c_d is data to be stored
c_d = double(c_d);        % c_d might be a uint8, so cast to double to make sure that doesn't matter
temp = typeData.data{j};
[r, c] = size(temp);
if cnt > r || l > c
  disp('Not allocated');  % Check on allocation. Never hit during profiling!
end
% c_d is a 4-element vector
if l == 4
  temp(row, 1) = c_d(1);   % Takes 1 to 1.5 s in profiler
  temp(row, 2) = c_d(2);   % Takes < 0.01 s
  temp(row, 3) = c_d(3);   % Takes < 0.01 s
  temp(row, 4) = c_d(4);   % Takes < 0.01 s

  local = NaN(r, c);       % Pre-allocate a local matrix
  local(row, 1) = c_d(1);  % Takes < 0.01 s
  local(row, 2) = c_d(2);  % Takes < 0.01 s
  local(row, 3) = c_d(3);  % Takes < 0.01 s
  local(row, 4) = c_d(4);  % Takes < 0.01 s
else
  typeData.data{j}(row, 1:l) = c_d;
end
typeData.data{j} = temp;

In the above example this loop runs about 13000 times, and setting the first column element takes by far the most time.

I'm wondering why setting the first column element takes so much time. It seems something in the structure of the temp matrix is different from the local matrix, but I have no idea what it can be. According to the debugger, both local and temp are equally sized matrices of doubles. Can someone shed some light on this?

Remarks:

  • The matrix stored in typeData.data{j} is pre-allocated using NaNs, but using something else, like zeros, does not matter.
  • Not pre-allocating the matrix stored in typeData.data{j} does not make a (significant) difference in performance. Pre-allocating a much larger matrix degrades performance.
  • Obviously there is more code around this, but I tried to keep this post small by not pasting all of it ;) Please ask about it!


1 Answer

Answer by Jan Simon on 11 Oct 2011
Accepted answer

This is the expected behaviour.

temp = typeData.data{j};

Now temp is a shared data copy of the j-th cell element. This means that temp has its own header, but shares the data with the cell element.

temp(row, 1) = c_d(1); % Takes 1 to 1.5 s in profiler

If you modify temp, a deep copy of the data is created first. This means the data is duplicated and the modification is applied to the new array. Of course, this is time consuming.
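This copy-on-write effect can be reproduced with a minimal sketch. The matrix here is deliberately made large so the one-time deep copy is visible; the exact timings depend on the machine and MATLAB version:

```matlab
c = {NaN(5000, 5000)};  % cell holding a pre-allocated matrix
temp = c{1};            % shared data copy: new header only, data not duplicated
tic
temp(1, 1) = 0;         % first write triggers a deep copy of the whole matrix
toc
tic
temp(1, 2) = 0;         % temp is now unshared, so this write is cheap
toc
```

The first toc is dominated by the duplication of all 5000x5000 doubles; the second write touches only a single element of the now-private array.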

temp(row, 2) = c_d(2); % Takes < 0.01 s

Now only an element of temp is changed, which is fast.

As you see, pre-allocating the elements of a cell is not helpful, but wastes memory. Only pre-allocate the cell itself and create the data at once instead of copying them.

This should be fast even with pre-allocation, because it writes directly to the already reserved memory:

typeData.data{j}(row, 1:l) = c_d;

You can use this for further investigations:

format debug

Now you see the data pointer pr: in the case of a shared data copy the pointer remains the same; after a deep copy you get a new pointer.
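For example (note that format debug is an undocumented feature and its output varies across MATLAB versions, so treat this only as a sketch):

```matlab
format debug
c = {zeros(3)};
temp = c{1}      % the displayed pr matches the data pointer of c{1}: still shared
temp(1, 1) = 1   % after this write the displayed pr differs: a deep copy was made
format short     % restore the normal display
```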

1 Comment

Jan on 12 Oct 2011

Not sure I wanted to know that ;) , but it clearly explains the situation. It also explains why typeData.data{j}(row, 1:l) = c_d is slow, because typeData is itself a shared copy. Thanks a lot!