
Why does Matlab suggest that I shouldn't use square brackets unless absolutely necessary? For example, if I type

x = [1:10];

the default code-checking feature suggests "Use of brackets [] is unnecessary. Use parentheses to group, if needed." I realize that the square brackets here are not necessary, but is there some cost to including them? Are parentheses more computationally efficient?

Jan
on 26 Jan 2017

Edited: Jan
on 26 Jan 2017

The square brackets impede the boundary checks when the vector is used for indexing:

    x = zeros(1, 10000);
    tic
    for k = 1:10000
        x(1:k) = k;       % colon range used directly as index
    end
    toc

    x = zeros(1, 10000);
    tic
    for k = 1:10000
        x([1:k]) = k;     % <- [...] added
    end
    toc

    tic
    for k = 1:10000
        v = 1:k;          % index vector stored in a variable first
        x(v) = k;
    end
    toc

Elapsed time is 0.087256 seconds.

Elapsed time is 0.374184 seconds. !!! Factor 4 !!!

Elapsed time is 0.375447 seconds.

When Matlab accesses an array element, it has to check whether the index is inside the allowed range: > 0, <= length, and integer-valued. It looks like Matlab performs this check in x(1:k) only for 1 and k, while in x([1:k]) the test is applied to each element, as in the third case x(v).
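The difference between the two kinds of check can be sketched as follows. This only illustrates the reasoning above; it is not MATLAB's actual internal implementation, and the variable names (n, idx, okVector, okRange) are made up for the example.

```matlab
% Illustration only: this mimics the checks described above,
% it is not how MATLAB validates indices internally.
n = 10;                       % length of the indexed array

% General index vector: every element has to be validated.
idx = [1 5 10];
okVector = all(idx >= 1 & idx <= n & idx == floor(idx));

% Range a:b with unit step: if both integer endpoints are valid,
% every element in between is valid as well.
a = 1; b = 10;
okRange = (a >= 1) && (b <= n) && (a == floor(a)) && (b == floor(b));
```

The vector check costs one comparison per element, while the range check costs the same handful of comparisons regardless of how long the range is.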

A similar effect occurs for logical indexing: Matlab has to check the size of the index vector only once, not each element.

    tic
    Lv = false(size(x));
    for k = 1:10000
        Lv(k) = true;     % extend the logical mask by one element
        x(Lv) = k;
    end
    toc

Elapsed time is 0.176343 seconds.

It is plausible that this is slower than x(1:k), because the values of the mask Lv have to be considered. But it is much faster than x([1:k]) or the equivalent usage of an index vector.
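As a small sanity check, assuming nothing beyond basic indexing, the three indexing styles compared above assign to exactly the same elements; only the run time differs. The variable names x1, x2, x3 are chosen for this sketch.

```matlab
k = 5;

x1 = zeros(1, 8);
x1(1:k) = k;              % colon range as index

x2 = zeros(1, 8);
v = 1:k;
x2(v) = k;                % stored index vector

x3 = zeros(1, 8);
Lv = false(size(x3));
Lv(1:k) = true;
x3(Lv) = k;               % logical mask

sameResult = isequal(x1, x2, x3);   % true: all three give identical arrays
```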

Richard Brown
on 19 Apr 2012

OK, I looked into this a little more rigorously. I thought I'd test James Tursa's suggestion from my previous answer to see if the order of the tests is actually important. So I did 500 repetitions of computing 1:100 100,000 times, with and without enclosing square brackets. I did the experiment twice, once with the square brackets first, once with them second. I performed a two-tailed paired t-test testing for a difference of the mean(t1 - t2) from zero.

    n = 100000;           % evaluations per timing
    N = 500;              % number of repetitions
    t1 = zeros(1, N);
    t2 = zeros(1, N);
    for k = 1:N
        tic
        for i = 1:n
            A = [1:100];
        end
        t1(k) = toc;
        tic
        for i = 1:n
            A = 1:100;
        end
        t2(k) = toc;
    end
    % Paired t-statistic of the timing differences
    t = mean(t1 - t2) / (std(t1 - t2) / sqrt(N));

For the brackets-first experiment I got a t-value of 1.18, and for the brackets-second experiment a t-value of -0.3. The critical t-value for a two-tailed test at the 95% level is about +/- 1.96 with 499 degrees of freedom, so in neither case is there a statistically significant difference between brackets and no brackets.

EDIT: As Jan Simon points out in this thread, http://www.mathworks.com/matlabcentral/answers/35972-how-to-best-time-differences-between-function-implementations, A should be cleared in each iteration. This makes an enormous difference: suddenly the version with the brackets runs at around half the speed of the version without (t-values around 300), and n needs to come down by two orders of magnitude. The JIT compiler had obviously recognised that it only had to define A once! I'll leave the original code as it is, as otherwise things will get confusing!

So, m-lint is correct!

SECOND EDIT

See my comment below for more comments.

Daniel Shub
on 19 Apr 2012

Oleg Komarov
on 20 Apr 2012

I wasn't clear about the overhead of the loop and the 1:100, but Daniel got it. Variability in the overheads might be greater than the [] effect. In fact, this is what I get: a huge change in the t-ratios.

Also, as with financial data, the lower the sampling frequency (monthly or quarterly data), the closer the distribution is to normal. That is, timing the sum of n evaluations 500 times, rather than timing each of the n evaluations individually, does matter for the distribution of t1 and t2.

Richard Brown
on 20 Apr 2012

Hi Oleg et al. This problem is getting more and more tricky! Comments:

Firstly, on my system I get no difference at all between using 1:2 and 1:100; the mean simulation time is the same (to the first significant figure at least). I think 1:100 and 1:2 basically have the same overhead, which presumably is the cost of the call to ':'. It is possible that this call has more variability than the call to [], but there's very little we can do to control or measure that. And if that is in fact the case, then the m-lint message is unnecessary. The m-lint message is implying that the cost of the call to [] is significant compared with the cost of ':'.

Secondly, it's essential to clear A after the call to [1:2] or 1:2 -- the JIT optimises away all subsequent calls if you don't do this, so my initial results were not relevant. Essentially it was timing a loop full of no-ops.

Thirdly, if I omit the semicolon (!) on the t = ... line, I get large t-values. With the semicolon, I again get t-values of the order of 1 or less. Not sure what the deal is there. I also observe differences in behaviour between my 64-bit Win7 and Ubuntu installs.

So it is difficult to disentangle these results from the behaviour of the JIT compiler, and presumably the calls to tic and toc.
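One way to take the hand-written tic/toc loops out of the picture is the timeit function, which is available in newer MATLAB releases than the ones used in this thread. The sketch below is only a suggestion under that assumption, and its numbers will still depend on the release and on the JIT. Both measurements include the overhead of calling an anonymous function.

```matlab
% timeit calls each function handle repeatedly and returns a typical
% run time in seconds, so no manual tic/toc loop is needed.
tBrackets = timeit(@() [1:100]);   % range wrapped in brackets
tPlain    = timeit(@() 1:100);     % plain range

fprintf('with brackets: %.3g s, without: %.3g s\n', tBrackets, tPlain);
```

Because the expressions are very cheap, timeit may warn that the measured function is too fast to time accurately. In that case a larger range such as 1:1e6 gives a more stable comparison.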

@Oleg, this is not like financial data. The point I think that you are making is that samples close together in time are correlated, so to get approximately independent samples you need to sample less frequently. These samples should be largely independent, although this assumes a uniform system load during the simulation time (which I'm not going to bother to try to control).

The inner n needs to be large so that the central limit theorem applies - the mean (and hence sum) of the 'n' iterations should be pretty close to normally distributed, and so the distribution of t1 and t2 should be very close to normal, making a paired t-test appropriate.

Conclusions? Not quite sure. I think that there is very little difference between [] and not, but it's pretty dependent on the JIT.

Jan
on 16 Apr 2012

1:100 is already a vector. The additional square brackets concatenate all elements into a vector. This does not change the data in any way, but it takes time.

If grouping is required, parentheses are more efficient, because they do not cost any runtime. They are resolved while the M-file is parsed.
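A typical case where grouping really is needed is transposing a range, because the transpose operator binds more tightly than the colon. A minimal sketch (variable names a, b, c are arbitrary):

```matlab
a = (1:3)';      % parentheses group the range before transposing: 3x1 column
b = [1:3]';      % same result, but written with the concatenation operator
c = 1:3';        % transpose applies to the scalar 3 only: still a 1x3 row
```

Here a and b are identical, so the parentheses do everything the brackets do, while c shows what happens when the grouping is left out entirely.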

[EDITED]:

The square brackets could be overloaded if the contents contain user-defined objects. Therefore the JIT should hesitate to "optimize them away". Imagine:

    a = 3; b = 10;
    for i = 1:100
        if rand > 0.5
            eval('b = myStrangeUserDefinedObject'); % Don't do this!
        end
        v = [a:b]; % ? Is [.] overloaded now ?
    end

Another idea: The runtime difference is small but measurable, and it might vanish with some JIT versions. But the code is cleaner and possibly easier to debug if the unnecessary brackets are omitted. Compare:

    x = a:b;      % Obviously clean
    y = [a:b];    % Obviously or not?!
    z = [[a:b]];  % Obviously messy

The last line forces the reader to think twice and some doubts will remain, while the first line is perfectly clear. The intermediate case should catch the attention also, therefore I prefer "a:b" for reasons of simplicity.

Richard Brown
on 17 Apr 2012

James Tursa
on 17 Apr 2012

Richard Brown
on 20 Apr 2012

eval should quite simply be deprecated. For the record, my preferences are the same. I really don't like:

[a:b]

If you must group them for visual effect, use parentheses. [] should only be used for concatenations.

Jan
on 9 Aug 2017

Using square brackets because they look matlabish and have something to do with vectors is Cargo Cult Programming; see https://en.wikipedia.org/wiki/Cargo_cult_programming. [ ] is the concatenation operator and nothing else, which is why the corresponding warnings appear in the editor.

While the overhead for this call is really tiny, it is valuable and important to be aware of Cargo Cults and to clean up the programming techniques. See also: Wiki: Programming Anti-Patterns.

Richard Brown
on 16 Apr 2012

A quick test reveals that there is a small cost to including them:

    tic
    for i = 1:1000000
        A = [1:100];      % with brackets
    end
    t1 = toc;

    tic
    for i = 1:1000000
        A = 1:100;        % without brackets
    end
    t2 = toc;

    disp(t1 - t2)

The displayed number is always positive. My guess is that when it sees the brackets it needs to determine whether a call to horzcat or vertcat is required.
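The relationship between the brackets and horzcat/vertcat can be illustrated like this. The sketch only shows that the results agree. Whether MATLAB internally dispatches to these functions for a single operand is the guess made above, not something the example proves.

```matlab
r = 1:3;

h = [r, 4:6];    % comma (or space) inside brackets: same result as horzcat(r, 4:6)
v = [r; 4:6];    % semicolon inside brackets: same result as vertcat(r, 4:6)
s = [r];         % a single operand: nothing to join, the result equals r
```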

Jan
on 21 Aug 2017

Edited: Jan
on 21 Aug 2017

An equivalent effect occurs in for loops:

    v = 1:1e5;
    r = zeros(1, 1e5);
    tic;
    for loop = 1:1000
        for k = v         % Loop over pre-defined vector
            r(k) = k;
        end
    end
    toc

    r = zeros(1, 1e5);
    tic;
    for loop = 1:1000
        for k = 1:1e5     % Loop over vector defined by limits
            r(k) = k;
        end
    end
    toc

Elapsed time is 3.304159 seconds.

Elapsed time is 0.700051 seconds.

It seems like the JIT can handle for k=a:b much more efficiently. The advantage is even higher, if the time to create v=1:1e5 is considered in addition.
