Why not use square brackets?

Chad Greene
Chad Greene on 16 Apr 2012
Commented: Bruno Luong on 3 Aug 2023
Why does Matlab suggest that I shouldn't use square brackets unless absolutely necessary? For example, if I type
x = [1:10];
the default code-checking feature suggests "Use of brackets [] is unnecessary. Use parentheses to group, if needed." I realize that the square brackets here are not necessary, but is there some cost to including them? Are parentheses more computationally efficient?
  1 Comment
Stephen23
Stephen23 on 8 Mar 2018
Edited: Stephen23 on 20 Jun 2019
The best reason for not using square brackets like that: they don't do anything in this situation. It is easy to fill code with plenty of other useless additions that don't do anything, but making code more complex than it needs to be just makes code harder to understand, debug, and maintain:
V = [+(+([+1])):[-(-9)]]
vs. simply
V = 1:9
Don't add anything to your code that does not serve a purpose.
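A minimal sketch of the point above, assuming nothing beyond base MATLAB (the variable names are illustrative only). It checks that wrapping a colon expression in brackets leaves the result unchanged:

```matlab
% Brackets around a single operand concatenate that operand with nothing,
% so the value, size, and class are the same as for the bare expression.
a = [1:10];   % bracketed range
b = 1:10;     % plain range

sameValue = isequal(a, b);               % same size and same elements
sameClass = strcmp(class(a), class(b));  % both are double

fprintf('same value: %d, same class: %d\n', sameValue, sameClass);
```

This only shows that the results are identical; it says nothing about speed.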


Accepted Answer

Jan
Jan on 26 Jan 2017
Edited: Jan on 26 Jan 2017
The square brackets impede the boundary checks when the vector is used for indexing:
x = zeros(1, 10000);
tic
for k = 1:10000
x(1:k) = k;
end
toc
x = zeros(1, 10000);
tic
for k = 1:10000
x([1:k]) = k; % <- [...] added
end
toc
tic
for k = 1:10000
v = 1:k;
x(v) = k;
end
toc
Elapsed time is 0.087256 seconds.
Elapsed time is 0.374184 seconds. !!! Factor 4 !!!
Elapsed time is 0.375447 seconds.
When Matlab accesses an array element, it has to check whether the index is inside the allowed range: an integer that is >= 1 and <= the array length. It looks like Matlab performs this check in x(1:k) only for 1 and k, while in x([1:k]) the test is applied to each element, as in the third case, x(v).
A similar effect occurs for logical indexing: Matlab has to check the size of the index vector only once, not each element.
tic
Lv = false(size(x));
for k = 1:10000
Lv(k) = true;
x(Lv) = k;
end
toc
Elapsed time is 0.176343 seconds.
It is plausible that this is slower than x(1:k), because the values of the mask Lv have to be considered. But it is much faster than x([1:k]) or the equivalent usage of an index vector.
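The comparison can also be sketched with timeit, which repeats each measurement and so is less sensitive to warm-up than a single tic/toc. This is only a sketch under assumptions: it times read indexing rather than the assignments above, the array size is arbitrary, and the JIT may treat expressions inside anonymous functions differently from the loops above, so the ratios can differ by release.

```matlab
% Time three ways of indexing the same range (read access only).
x = rand(1, 1e6);
k = numel(x);
v = 1:k;                           % index vector created in advance

tColon   = timeit(@() x(1:k));     % colon expression inside the index
tBracket = timeit(@() x([1:k]));   % same range wrapped in brackets
tVector  = timeit(@() x(v));       % pre-built index vector

fprintf('x(1:k):   %.3g s\n', tColon);
fprintf('x([1:k]): %.3g s\n', tBracket);
fprintf('x(v):     %.3g s\n', tVector);
```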
  1 Comment
Bruno Luong
Bruno Luong on 3 Aug 2023
Edited: Bruno Luong on 3 Aug 2023
@Jan Your logical loop is not directly comparable with the others; this is fairer:
tic
Lv = false(size(x));
for k = 1:10000
Lv(1:k) = true; % switch the entire head, not just a single element
x(Lv) = k;
end
toc
The conclusion still holds, though. Thanks for the heads-up about this interesting effect.


More Answers (5)

Jan
Jan on 16 Apr 2012
1:100 is a vector already. Additional square brackets concatenate all elements into a vector. This does not change the data, but it takes time.
If grouping is required, parentheses are more efficient, because they do not cost any runtime: they are resolved while the M-file is parsed.
[EDITED]:
The square brackets could be overloaded if the contents contain user-defined objects. Therefore the JIT should hesitate to "optimize them away". Imagine:
a = 3; b = 10;
for i = 1:100
if rand > 0.5
eval('b = myStrangeUserDefinedObject'); % Don't do this!
end
v = [a:b]; % ? Is [.] overloaded now ?
end
Another idea: the runtime difference is small but measurable, and it might vanish with some JIT versions. But the code is cleaner and possibly easier to debug if the unnecessary brackets are omitted. Compare:
x = a:b; % Obviously clean
y = [a:b]; % Obviously or not?!
z = [[a:b]]; % Obviously messy
The last line forces the reader to think twice and some doubts will remain, while the first line is perfectly clear. The intermediate case should catch the attention also, therefore I prefer "a:b" for reasons of simplicity.
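To illustrate where grouping actually changes the result, here is a short sketch (variable names are illustrative). It relies on the documented operator precedence: transpose and arithmetic bind tighter than the colon operator.

```matlab
n = 5;

r1 = 1:n-1;     % subtraction is evaluated first: 1:4
r2 = (1:n)-1;   % range is built first, then shifted: 0:4

c1 = 1:n';      % transpose applies to n only, so this is still a row
c2 = (1:n)';    % parentheses group the range, giving a column
c3 = [1:n]';    % same column as c2, but written as a concatenation

disp(isequal(c2, c3))   % brackets and parentheses give the same value here
```

In the cases where grouping is needed, parentheses express it directly, so the brackets add nothing.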
  4 Comments
James Tursa
James Tursa on 17 Apr 2012
Maybe try the loops in reverse order (no [ ] first, then [ ]) and see if t1-t2 is always negative to support that it is in fact the brackets making the difference and not just the order of the loops.
Richard Brown
Richard Brown on 20 Apr 2012
eval should quite simply be deprecated. For the record, my preferences are the same. I really don't like:
[a:b]
If you must group them for visual effect, use parentheses. [] should only be used for concatenations.



Richard Brown
Richard Brown on 19 Apr 2012
OK, I looked into this a little more rigorously. I thought I'd test James Tursa's suggestion from my previous answer to see if the order of the tests is actually important. So I did 500 repetitions of computing 1:100 100,000 times, with and without enclosing square brackets. I did the experiment twice, once with the square brackets first, once with them second. I performed a two-tailed paired t-test testing for a difference of the mean(t1 - t2) from zero.
n = 100000;
N = 500;
t1 = zeros(1, N);
t2 = zeros(1, N);
for k = 1:N
tic
for i = 1:n
A = [1:100];
end
t1(k) = toc;
tic
for i = 1:n
A = 1:100;
end
t2(k) = toc;
end
t = mean(t1 - t2) / (std(t1 - t2) / sqrt(N));
For the brackets-first experiment I got a t-value of 1.18, and for brackets-second a t-value of -0.3. The critical t-value at the 95% level (two-tailed) is roughly +/- 2 in both cases, so in neither case is there a statistically significant difference between brackets and no brackets.
EDIT: As Jan Simon points out in this thread http://www.mathworks.com/matlabcentral/answers/35972-how-to-best-time-differences-between-function-implementations A should be cleared in each iteration. This makes an enormous difference: suddenly the version with the brackets runs at around half the speed of the version without (t-values around 300), and n needs to come down by two orders of magnitude. The JIT compiler had obviously recognised that it only had to define A once! I'll leave the original code as it is, as otherwise things will get confusing!
So, m-lint is correct!
SECOND EDIT
See my comment below for more comments.
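For reference, the modified experiment described in the EDIT might look roughly like the sketch below. The exact code was not posted, so this is an assumption: A is cleared with clear A inside each inner loop, and n is reduced from 100000 to 1000 (two orders of magnitude).

```matlab
n = 1000;   % assumed value: two orders of magnitude below the original n
N = 500;
t1 = zeros(1, N);
t2 = zeros(1, N);
for k = 1:N
    tic
    for i = 1:n
        A = [1:100];   % with brackets
        clear A        % assumed way of forcing A to be rebuilt each pass
    end
    t1(k) = toc;
    tic
    for i = 1:n
        A = 1:100;     % without brackets
        clear A
    end
    t2(k) = toc;
end
d = t1 - t2;
t = mean(d) / (std(d) / sqrt(N));   % paired t statistic, as in the original
```

Note that the cost of clear itself is included in both loops, so it should largely cancel in the paired difference.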
  6 Comments
Oleg Komarov
Oleg Komarov on 20 Apr 2012
I wasn't clear about the overhead of the loop and the 1:100 but Daniel got it. Variability in the overheads might be greater than [] effect. In fact, this is what I get, a huge change in the t-ratios.
Also, like with financial data, the lower the frequency (monthly, quarterly data) the closer to normality...i.e. if you time 500 times the sum of times (1:n) instead of timing 'n' times it does matter for the distribution of t1, t2.
Richard Brown
Richard Brown on 20 Apr 2012
Hi Oleg et al. This problem is getting more and more tricky! Comments:
Firstly, on my system I get no difference at all between using 1:2 and 1:100, the mean simulation time is the same (in the first s.f. at least). I think 1:100 and 1:2 basically have the same overhead - which presumably is the cost of the call to ':'. It is possible that this call has more variability than the call to [], but there's very little we can do to control or measure that. And if that is in fact the case, then the m-lint message is unnecessary. The m-lint message is implying that the cost of the call to [] is significant compared with the cost of ':'.
Secondly, it's essential to clear A after the call to [1:2] or 1:2 -- the JIT optimises away all subsequent calls if you don't do this, so my initial results were not relevant. Essentially it was timing a loop full of no-ops.
Thirdly, if I have no semicolon! on the t = ... line, then I get large t-values. If I have a semicolon, then I again get t-values of the order of 1 or less. Not sure what the deal is there. I also observe differences in behaviour between my 64 bit Win7 and Ubuntu installs.
So it is difficult to disentangle these results from the behaviour of the JIT compiler, and presumably the calls to tic and toc.
@Oleg, this is not like financial data. The point I think that you are making is that samples close together in time are correlated, so to get approximately independent samples you need to sample less frequently. These samples should be largely independent, although this assumes a uniform system load during the simulation time (which I'm not going to bother to try to control).
The inner n needs to be large so that the central limit theorem applies - the mean (and hence sum) of the 'n' iterations should be pretty close to normally distributed, and so the distribution of t1 and t2 should be very close to normal, making a paired t-test appropriate.
Conclusions? Not quite sure. I think that there is very little difference between [] and not, but it's pretty dependent on the JIT.



Richard Brown
Richard Brown on 16 Apr 2012
A quick test reveals that there is a small cost to including them
tic
for i = 1:1000000
A = [1:100];
end
t1 = toc;
tic
for i = 1:1000000
A = 1:100;
end
t2 = toc;
disp(t1 - t2)
The displayed number is always positive. My guess is that when it sees the brackets it needs to determine whether a call to horzcat or vertcat is required.
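As background for that guess, the bracket syntax corresponds to the concatenation functions horzcat and vertcat. The sketch below (illustrative variable names) only confirms that the results match; it cannot show what the parser or JIT does internally for a single operand such as [1:100].

```matlab
a = 1:3;
b = 4:6;

rowCat = [a, b];   % comma or space inside brackets: horizontal concatenation
colCat = [a; b];   % semicolon inside brackets: vertical concatenation

sameRow = isequal(rowCat, horzcat(a, b));
sameCol = isequal(colCat, vertcat(a, b));
noOp    = isequal([a], a);   % a single operand comes back unchanged

fprintf('horzcat match: %d, vertcat match: %d, single operand unchanged: %d\n', ...
    sameRow, sameCol, noOp);
```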
  1 Comment
Bruno Luong
Bruno Luong on 3 Aug 2023
Be aware that this conclusion is NOT reliable: I simply swapped the order, and now the bracketed version is faster.
tic
for i = 1:1000000
A = 1:100;
end
t1 = toc
t1 = 0.0132
tic
for i = 1:1000000
A = [1:100];
end
t2 = toc
t2 = 0.0087
disp(t1 - t2)
0.0046



Jan
Jan on 9 Aug 2017
Using square brackets because they look matlabish and seem to have something to do with vectors is Cargo Cult Programming, see https://en.wikipedia.org/wiki/Cargo_cult_programming. [ ] is the concatenation operator and nothing else, which is why the corresponding warnings appear in the editor.
While the overhead for this call is really tiny, it is valuable and important to be aware of Cargo Cults and to clean up the programming techniques. See also: Wiki: Programming Anti-Patterns.

Jan
Jan on 21 Aug 2017
Edited: Jan on 21 Aug 2017
An equivalent effect occurs in for loops:
v = 1:1e5;
r = zeros(1, 1e5);
tic;
for loop = 1:1000
for k = v % Loop over pre-defined vector
r(k) = k;
end
end
toc
r = zeros(1, 1e5);
tic;
for loop = 1:1000
for k = 1:1e5 % Loop over vector defined by limits
r(k) = k;
end
end
toc
Elapsed time is 3.304159 seconds.
Elapsed time is 0.700051 seconds.
It seems like the JIT can handle for k=a:b much more efficiently. The advantage is even higher, if the time to create v=1:1e5 is considered in addition.
  2 Comments
Bruno Luong
Bruno Luong on 3 Aug 2023
The difference is less dramatic now than in 2017
wrapper()
Elapsed time is 0.331812 seconds.
Elapsed time is 0.138714 seconds.
function wrapper
v = 1:1e5;
r = zeros(1, 1e5);
tic;
for loop = 1:1000
for k = v % Loop over pre-defined vector
r(k) = k;
end
end
toc
r = zeros(1, 1e5);
tic;
for loop = 1:1000
for k = 1:1e5 % Loop over vector defined by limits
r(k) = k;
end
end
toc
end

