MATLAB Answers

OOP-Performance problems in accessing large arrays in class properties

Simon on 7 Nov 2013
Commented: Matt J on 5 Dec 2013
Hi!
I have a problem with accessing arrays that are properties of a class. I attached a sample class. It defines just one property, an array of size (N, 3), where "N" may be set to different values. The problem is that for large N it takes a long time to set a few values somewhere in the matrix (it is preallocated with zeros for testing purposes).
The test procedure is:
clear all
close all
clc
% create instance
T = TimingTest;
% test for array of 1000x3
T.Resize(1e3);
T.DoTesting;
% test for array of 100000x3
T.Resize(1e5);
T.DoTesting;
I cannot see why it takes so long or what I might have done wrong. Hints are very welcome. Thanks in advance!

Answers (3)

Yair Altman on 1 Dec 2013
This is because you are inadvertently reallocating hundreds of thousands of elements (1e5 x 3 = 300,000), 1e4 times, in the following line:
ctmp(ind, 1:3) = value;
Notice that ctmp was previously set to
ctmp = obj.Arr;
which uses copy-on-write, i.e. only a pointer (reference) is copied. In the assignment line, however, you modify 3 values, so the entire array (100K x 3 x 8 bytes = 2.4 MB) must be allocated and copied before it can be modified. You are then repeating this 1e4 times.
So the problem is not so much with OOP performance as with your inefficient memory reallocations.
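The copy-on-write behavior and the byte count above can be sketched with plain variables. This is only an illustration: the variable names are made up, and the repeat count of 1e4 is taken from the answer rather than from the attached class.

```matlab
% Copy-on-write sketch with plain variables (names are illustrative).
N = 1e5;
A = zeros(N, 3);                 % 1e5 x 3 doubles, as in the larger test
bytesPerCopy = numel(A) * 8;     % 8 bytes per double: 2.4e6 bytes = 2.4 MB

B = A;                           % no data copied yet, B shares A's memory
B(5, 1:3) = [1 2 3];             % first write to B forces a copy of the whole array

% A is untouched, which is why the copy had to happen.
unchanged = all(A(5, :) == 0);

nRepeats = 1e4;                  % repeat count quoted in the answer
totalBytes = bytesPerCopy * nRepeats;   % total bytes copied if every write copies
fprintf('%.1f MB per copy, %.1f GB over %d repeats\n', ...
    bytesPerCopy / 1e6, totalBytes / 1e9, nRepeats);
```

If every one of the writes really triggers a full copy, the total memory traffic is the per-copy size multiplied by the number of writes, which is what the last lines compute.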
  2 Comments
Matt J on 2 Dec 2013
No, you will see the same slow speed even if you use a non-handle class. You need to make sure, however, that you are correctly resizing T, with the modified syntax
T=T.Resize(1e3);
T=T.Resize(1e5);
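The reason for the modified syntax can be sketched with a small value class. ValueDemo is a hypothetical class written for this illustration; it is not the TimingTest class attached to the question.

```matlab
% Save as ValueDemo.m. Hypothetical value class, for illustration only.
classdef ValueDemo
    properties
        Arr = zeros(0, 3);
    end
    methods
        function obj = Resize(obj, n)
            % A value-class method works on a copy of the object,
            % so the modified object must be returned to the caller.
            obj.Arr = zeros(n, 3);
        end
    end
end
```

With this class, calling T.Resize(1e3); without an assignment leaves T unchanged because the returned object is discarded, whereas T = T.Resize(1e3); stores the resized object back into T.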



Matt J on 4 Dec 2013
Edited: Matt J on 4 Dec 2013
"But apparently it takes the same time to modify the property directly with obj.Arr(ind, 1:3) = value;"
This is interesting (and unfortunate), but I believe I know why it happens. When you modify property data, the entire array-valued property contents are always first pre-processed and returned via the property's get() method. After that, indexing operations are applied to the get method's output and the array is modified. Then the modified data is put back into the property, but post-processed by the property's set method. In your case, you haven't defined get.Arr() and set.Arr() methods, but default ones are provided in the background.
Because the pre- and post-processing done by get.Arr and set.Arr are limited only by the class designer's imagination, MATLAB cannot simply take specific array elements out and put modified ones back in place. It has no choice but to implement the expression obj.Arr(ind, 1:3) = value with the equivalent of
ctmp = obj.Arr; %get.Arr() called here
ctmp(ind, 1:3) = value;
obj.Arr = ctmp; %set.Arr() called here
Thus a second deep copy of Arr is made before obj.Arr is updated. That is unfortunate; hopefully future releases will offer property attributes that let you bypass set/get. containers.Map objects don't have set/get methods as middlemen (I don't think), so they don't have this problem.
Note, however, that this slow access only occurs for operations that modify properties. If you simply did a subsref operation to access part of obj.Arr
value = obj.Arr(ind, 1:3) ;
you would not see this slow behavior.
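If the cost really comes from the get/modify/set round trip on every write, one possible workaround is to make that round trip once per batch instead of once per element. The sketch below assumes this; BatchDemo and SetRows are hypothetical names, not part of the class from the question, and any speedup would need to be measured on your own release.

```matlab
% Save as BatchDemo.m. Hypothetical handle class, for illustration only.
classdef BatchDemo < handle
    properties
        Arr = zeros(0, 3);
    end
    methods
        function Resize(obj, n)
            obj.Arr = zeros(n, 3);
        end
        function SetRows(obj, inds, values)
            % Read the property into a local variable once.
            ctmp = obj.Arr;
            % The first write may copy the array; later writes act on
            % the local variable only.
            for k = 1:numel(inds)
                ctmp(inds(k), 1:3) = values(k, :);
            end
            % Write the property back once for the whole batch.
            obj.Arr = ctmp;
        end
    end
end
```

A call might look like T = BatchDemo; T.Resize(1e5); T.SetRows([2 5], [1 2 3; 4 5 6]);, which updates rows 2 and 5 with a single property read and a single property write.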
  2 Comments
Matt J on 5 Dec 2013
"MATLAB® has no default set or get property access methods."
Puzzling. But even if no set/get methods are called, I can still imagine the developers using the same copy semantics as if there were set/get methods defined. It would make coding simpler, I'd guess.
"But I have to admit that I use matlab version 7.11.2.1031 (R2010b) Service Pack 2 at the moment. So I don't know how the current release performs."
I'm seeing the same behavior in R2013a.



Simon on 7 Nov 2013
To answer myself: It seems that this problem occurs only if the class is a handle class ...
  4 Comments
Simon on 4 Dec 2013
But apparently it takes the same time to modify the property directly with
obj.Arr(ind, 1:3) = value;
and to modify it in a copy of the property with
ctmp = obj.Arr;
ctmp(ind, 1:3) = value;
obj.Arr = ctmp;
This is measured in "timing for property array" vs. "timing combined loop", and it is independent of the class type (value or handle).
I'm interested in the best (meaning most efficient) way to handle this, which is why I ask these rather pedantic questions. There are situations where computational time matters. Something else I discovered during my investigations is that the time it takes to store a value increases as the array gets filled. This supports your statement about copying the array: copying zeros seems to be more efficient. Interestingly, I cannot see a comparable slowdown if I use containers.Map objects.
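A comparison of this kind can be sketched without a class, using tic/toc on plain variables. The sizes and names below are illustrative assumptions, and the measured times will depend on the MATLAB release and machine, so no particular result is claimed.

```matlab
% Timing sketch with plain variables (no class involved); names are illustrative.
N = 1e5;                       % rows, as in the 100000x3 test
nWrites = 200;                 % kept small so the sketch runs quickly
inds = randi(N, nWrites, 1);   % random rows to overwrite

% Variant 1: re-share the array before every write, like the three-line pattern.
A = zeros(N, 3);
tic;
for k = 1:nWrites
    ctmp = A;                  % shared reference, no data copied yet
    ctmp(inds(k), 1:3) = k;    % write to shared data, may force a full copy
    A = ctmp;
end
tShared = toc;

% Variant 2: write into an array that nothing else references.
B = zeros(N, 3);
tic;
for k = 1:nWrites
    B(inds(k), 1:3) = k;
end
tDirect = toc;

% Both variants apply the same writes in the same order.
sameResult = isequal(A, B);
fprintf('shared copy: %.4f s, direct: %.4f s, same result: %d\n', ...
    tShared, tDirect, sameResult);
```

Both loops produce the same array, so any difference in the two timings reflects how the writes are carried out rather than what they compute.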
