Using subrange makes array operations slow

Question

Clayton Davis on 18 Dec 2014

0
Link

Direct link to this question

https://www.mathworks.com/matlabcentral/answers/167099-using-subrange-makes-array-operations-slow

Edited: Matt J on 20 Dec 2014

This is a question about the performance of code which I would naively expect to be much better. I define two random arrays:

a = randn(1, 10000);
b = randn(10000, 1);

Then try to multiply them together in two ways:

timeit(@() a*b)
ans =
   4.3080e-06
timeit(@() a(1:10000)*b(1:10000))
ans =
   1.5634e-05

You can see that when I use a sub-range of the arrays, multiplication gets about 3.6x slower, even in the case when the subrange is the whole array.

I could imagine that there is some overhead in parsing the subrange part of the command; but when the vectors become longer, the "overhead" grows:

a = randn(1, 1000000);
b = randn(1000000, 1);
timeit(@() a*b)
ans =
    0.0010
timeit(@() a(1:1000000)*b(1:1000000))
ans =
    0.0062

So, I have the two-fold question:

1) As a matter of curiosity, why is the performance so much worse when I use sub-ranges?

2) When we wish to perform read-only operations on sub-ranges, what can we do (besides copying the array) to keep performance reasonable?

Thanks, Clayton

PS. Version information: MATLAB Version: 8.3.0.532 (R2014a) Operating System: Microsoft Windows 7 Ultimate Version 6.1 (Build 7601: Service Pack 1) Java Version: Java 1.7.0_11-b21 with Oracle Corporation Java HotSpot™ 64-Bit Server VM mixed mode

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

Matt J on 18 Dec 2014

0
Link

Direct link to this answer

https://www.mathworks.com/matlabcentral/answers/167099-using-subrange-makes-array-operations-slow#answer_162778

Edited: Matt J on 19 Dec 2014

1) As a matter of curiosity, why is the performance so much worse when I use sub-ranges?

When you read a sub-array on the right hand side of a statement, a temporary variable holding a copy of the sub-array is made internally. E.g., the command

c=sum(a(1:N))

creates a temporary variable to hold [a(1),a(2),...,a(N)] occupying its own block of memory (separate from the memory occupied by the complete vector a) which is then passed to the sum() command. This mechanism makes it so that the sum() command doesn't have to know anything about what larger array a(1:N) belongs to in order to do its work. It just handles it like any other input variable. However, there is time and memory overhead for the copying, of course.

For left-hand side vectorized assignment, e.g.,

a(1:N)=1

copies are not made.

2) When we wish to perform read-only operations on sub-ranges, what can we do (besides copying the array) to keep performance reasonable?

Use a MEX file. In C-code, you can do all the in-place computations you want.

1 Comment
Show -1 older commentsHide -1 older comments

Matt J on 19 Dec 2014

Edited: Matt J on 20 Dec 2014

I also seem to remember that in early versions of MATLAB, in the expression

c=sum(a(1:N))

not only would the data [a(1),a(2),...,a(N)] get copied, but also the expression 1:N would internally generate a temporary index vector N elements long. I.e., it was equivalent to doing the following

      indices=1:N;
      c=sum(a(indices));

which of course has further time and memory overhead due to the creation of "indices".

Sign in to comment.

Using subrange makes array operations slow

0 Comments
Show -2 older commentsHide -2 older comments

Accepted Answer

1 Comment
Show -1 older commentsHide -1 older comments

More Answers (0)

See Also

Categories

Tags

Products

Community Treasure Hunt

Using subrange makes array operations slow

0 Comments Show -2 older commentsHide -2 older comments

Accepted Answer

1 Comment Show -1 older commentsHide -1 older comments

More Answers (0)

See Also

Categories

Tags

Products

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

1 Comment
Show -1 older commentsHide -1 older comments