How do I avoid getting fooled by 'implicit expansion'?
I am trying my very best not to be a grumpy old man here, but I just wasted the better part of an hour because of the addition of 'implicit expansion', and I need to hear from the proponents of this feature how to live with it. The offending bit of code was:
smoothed=smooth(EEG.data(iC,:),EEG.srate*60,'moving');
deviation=EEG.data(iC,:)-smoothed;
This bit of code kept giving me a 'memory error', which was odd, since 'deviation' should be a lot smaller than 'EEG'. However, EEG is very large, and I don't have room for two of them in memory, so I figured there might be some intermediate step in the calculation that was tripping me up, or perhaps Windows was hogging the resources, or something else out of my control. It took two restarts of MATLAB, one reboot of the PC, and a complicated rewrite of the script (which still didn't fix it) to finally realize that 'smooth' (which is a MATLAB built-in) was changing the dimensions, such that I was subtracting a column vector from a row vector.
What is the good coding practice for this not to occur? I would have thought that having a language that complained when you performed an ill-defined operation was the good solution to this problem, but I can see from my google search that apparently a lot of people think that 'implicit expansion' is a great good. How do you avoid pitfalls such as this? (please note that if I hadn't been running out of memory, I might have made it a good deal further down the script before noticing that something was off).
Should I just never trust any command to preserve the dimensions of my arrays, even if it's an inbuilt one?
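For readers who have not hit this before, here is a minimal sketch of the effect described above. The variable names and the sample count N are made up for illustration; the real EEG data is not reproduced here.

```matlab
% Small stand-in for one EEG channel (hypothetical data, not the real recording)
rowData = rand(1, 5);      % 1-by-5, same orientation as EEG.data(iC,:)
colData = rowData(:);      % 5-by-1, the orientation that smooth returned

d = rowData - colData;     % implicit expansion: no error since R2016b
disp(size(d))              % the result is 5-by-5, not 1-by-5

% With N samples the result holds N^2 elements, which explains the memory error
N = 1e5;                   % assumed sample count, for illustration only
bytesNeeded = N^2 * 8;     % doubles take 8 bytes each
fprintf('%.0f GB\n', bytesNeeded / 1e9);
```

So a row of 100,000 samples minus a column of the same length asks for roughly 80 GB, even though each input is under 1 MB.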
7 Comments
Jan
on 16 Feb 2022
@Alexander Thomas: "make it optional to implicitly expand" - No, this cannot work. Remember that Matlab's toolbox functions expect implicit expansion to be enabled. If a switch were introduced, each and every toolbox function would have to store the current setting, enable the expansion, and finally restore the former value. This would waste too much time.
Many of the code snippets I write here in the forum to answer questions make use of implicit expansion. For the first 3 years I marked this with "% Auto-expanding, >= R2016b" and added a comment with the corresponding bsxfun call. Today most of the Matlab users in this forum are familiar with this feature. Removing it, or even allowing users to switch it off manually, would cause serious incompatibilities.
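As an illustration of that commenting habit, a short sketch with made-up arrays a and b (not taken from any particular forum answer) might look like this:

```matlab
a = (1:3).';                  % 3-by-1 column
b = 10:10:40;                 % 1-by-4 row

c1 = a + b;                   % Auto-expanding, >= R2016b
c2 = bsxfun(@plus, a, b);     % explicit equivalent, also runs in older releases

assert(isequal(c1, c2));
assert(isequal(size(c1), [3 4]));
```

The bsxfun form makes the intent to expand visible to the reader, which the plain operator does not.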
It has always been both a feature and a problem that Matlab tries to be smart. Preferring to operate along the first non-singleton dimension is convenient and dangerous, because beginners tend to forget that the dimensions can differ from their expectations. The length command was a really bad idea, and so was findstr(a,b): it searched for the shorter of the two arguments in the longer one. Of course "hold on" saves some keystrokes compared to "hold('on')", but the command (non-functional) syntax was a frequent source of bugs in the past, when Matlab's guess whether the argument is a char vector or a number failed. The command plot(1:10, rand(1,10)) creates an axes automagically, and a surrounding figure as well, except if one already exists.
Implicit expansion is another smart feature. It would have been a safer decision to introduce new operators like $+, $*, etc. But MathWorks decided to make it transparent. I was not happy about it, because I prefer a program to stop if something unexpected happens, whereas the auto-magic produces an unexpected result instead. But if the auto-expanding is applied intentionally, it is an efficient and powerful tool.
@John D'Errico: "Implicit expansion is a tool that as you gradually become a more experienced user, you will appreciate" - I do not agree. I'd prefer an explicit operator instead of increasing the power of existing operators. But, as you said: Such is life, especially as a programmer. It is part of Matlab now and I use it. The questions in this forum show that the expansion is not a frequent cause of bugs; e.g. rand(1e6) appears more frequently.
@kaare: Thanks for caring about the tone in the forum. As I read John's answer, the most emotional part is: "Sigh". I see neither a high horse, nor arrogance, nor presumptuousness, nor a condescending statement. Of course, politeness is the basis of an efficiently working community.
Accepted Answer
Jan
on 19 Feb 2018
Edited: Jan
on 16 Feb 2022
Should I just never trust any command to preserve the dimensions of my arrays,
even if it's an inbuilt one?
The shapes of the outputs of built-in functions have changed in the past, and I expect this to happen in the future as well. Therefore I capture my assumptions about shapes in a "unit-test"-like function. So when I write
smoothed = smooth(EEG.data(iC,:), EEG.srate*60, 'moving');
I add this to the test function:
y = smooth(rand(1, 100), 5, 'moving');
if ~isrow(y)
    error('SMOOTH does not return a row vector for a row vector input.');
end
Nevertheless, it is a lot of work to do this for all assumptions. In many cases I'm not even aware of what I assume, e.g. for strncmp('hello', '', 2), which has changed its behavior in the past also.
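A sketch of what such a collection of shape assumptions could look like is given below. It deliberately uses base MATLAB functions (movmean, cumsum, find) as stand-ins so that it runs without any toolbox; which functions belong in the list depends on your own code, so treat the choice here as an assumption.

```matlab
% Shape assumptions, to be re-run after every MATLAB upgrade
r = rand(1, 100);                  % row vector test input

y = movmean(r, 5);                 % base MATLAB moving average (R2016a+)
assert(isrow(y), 'MOVMEAN does not return a row vector for a row vector input.');

s = cumsum(r);
assert(isequal(size(s), size(r)), 'CUMSUM changed the shape of its input.');

idx = find(r > 0.5);
assert(isrow(idx), 'FIND does not return a row vector for a row vector input.');

disp('All shape assumptions hold.');
```

Each assert documents one assumption, so a failure after an upgrade points directly at the function whose behavior changed.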
In your case it would have been smart and efficient if
deviation = EEG.data(iC,:) - smoothed
caused an error. Unfortunately implicit expansion tries to handle this smartly, but in many cases it is "smarter" than the programmer. When it is intended, implicit expansion is nice and handy, but it is also an invitation for bugs. All we can do is live with it, because it is rather unlikely that TMW will remove this feature. But it cannot hurt to send this to them as an enhancement request.
To answer the actual question:
How do I avoid getting fooled by 'implicit expansion'?
Use Matlab < R2016b, at least for testing your code.
More Answers (4)
Guillaume
on 19 Feb 2018
Yes, some complain about implicit expansion. For me, it's a logical expansion (pun intended) of the 1-D scalar expansion to N-D. On the other hand, I was also happy with bsxfun.
As to your problem, at the heart it's a design failure on your part I'm afraid. If you never validate your assumptions, you can expect that things go wrong. Instead of a single large script, use functions. The first thing that a function should do is validate that its inputs are as expected. If two vectors are expected as input, check they're the same size, etc. Whenever I write a function, the first thing written is the help, with a clear listing of the input and their requirements, followed by actual validation of these requirements.
In your case, a simple
assert(isequal(size(v1), size(v2)))
would have caught the problem.
The implicit expansion forces you to be more careful about the shape of your vectors. In my opinion, it's not a bad thing.
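One possible way to write such a check inline, sketched here with stand-in data, is validateattributes with its 'size' attribute. The variables x and smoothed are hypothetical placeholders for EEG.data(iC,:) and the output of smooth.

```matlab
x = rand(1, 8);          % stand-in for EEG.data(iC,:), a row vector
smoothed = x(:);         % stand-in for a result that came back as a column

try
    % Throws an error unless smoothed has exactly the size of x
    validateattributes(smoothed, {'numeric'}, {'size', size(x)});
    deviation = x - smoothed;   % only reached when the sizes match
catch err
    fprintf('Caught before the subtraction: %s\n', err.message);
end
```

Here the check fails, so the subtraction never runs and no expanded matrix is allocated.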
2 Comments
John D'Errico
on 19 Feb 2018
Edited: John D'Errico
on 19 Feb 2018
Admittedly, I would not be using assert here to check dimensions. But the fact remains that I would KNOW the shape of my arrays. If you don't know positively what shape arrays and vectors are produced by any operation, then it pays to learn to be more careful.
When you use a function, make sure you know what it returns. Read the help. Try a test case, in case the help was not sufficiently clear for you.
Matt J
on 19 Feb 2018
Well, the error message would have told you that the out-of-memory was originating in the line
deviation=EEG.data(iC,:)-smoothed;
I should think that checking the dimensions of the right hand side quantities in the debugger would have been your first step.
Petorr
on 20 Jun 2022
I would like the debugger to automatically highlight where implicit array expansion is taking place. Is that option available? For example:
Here, A might be 100*1 and B, 1*500 so the highlighting would let me know that these are compatible unequal array sizes and will be implicitly expanded.
2 Comments
Steven Lord
on 21 Jun 2022
No such option exists. Would you expect this option to always highlight that operation in your code? A might be 100-by-1 and B might be 1-by-500. But they may both be scalars. There are circumstances where a static analysis of the code could prove at parse-time that the operation will perform implicit expansion, but what about this one?
function z = computeProduct(x, y)
z = x.*y;
end
Should the .* in this code be highlighted or not?
Petorr
on 21 Jun 2022
In this example, it would only be highlighted while debugging within computeProduct, as in putting a breakpoint at the line z=x.*y. It would be comparable to the hover-tip that shows array dimensions. I use that often, even though it depends on the program state. I see how it might seem a little inconsistent since all other highlighting is based on static parsing, but some variation of it might be worth considering. If I get so good with the implicit sizing that I never need this feature, I'll be sure to come back and comment ; )
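Until something like that exists, a runtime check can approximate it. The following is only a sketch: willExpand is a hypothetical helper, not a MATLAB function, and it reports any pair of non-scalar operands whose sizes differ, which includes pairs that are incompatible and would raise an error rather than expand.

```matlab
% Hypothetical helper: true when A op B would go beyond plain scalar expansion
willExpand = @(A, B) ~isscalar(A) && ~isscalar(B) && ~isequal(size(A), size(B));

A = rand(100, 1);
B = rand(1, 500);

if willExpand(A, B)
    fprintf('Operand sizes differ: [%s] and [%s]\n', ...
            num2str(size(A)), num2str(size(B)));
end
z = A .* B;              % 100-by-500 through implicit expansion
```

Used as a conditional breakpoint expression or a temporary guard, this flags the line at run time, where the actual sizes are known.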
Rav
on 25 Mar 2024
Ok, since I use eeglab I managed to trace your screw-up.
EEG.data(iC,:) is a row vector, but 'smooth' outputs a column vector.
Actually, the error message also shows what's wrong: look at the dimensions. That's not the fault of the expansion.
The correct debugging step would be to call "size(smoothed)" and "size(EEG.data)".
The correct fix for your code is this:
deviation=EEG.data(iC,:)-smoothed.';
'smoothed' is transposed here, a difference of two characters.
A good coding practice is to run all unfamiliar and failing functions by hand in the console and look at the output: not just the console output, but also the variables in the workspace.
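If you would rather not depend on knowing which orientation comes back, a variant of the same fix is to reshape the result to the size of the input. This is a sketch with stand-in data, not code from eeglab:

```matlab
x = rand(1, 6);                % stand-in for EEG.data(iC,:)
smoothed = x(:) * 0.5;         % stand-in for a column vector result

deviation1 = x - smoothed.';                  % the transpose fix
deviation2 = x - reshape(smoothed, size(x));  % orientation-independent variant

assert(isequal(deviation1, deviation2));
assert(isequal(size(deviation2), size(x)));
```

reshape also raises an error when the number of elements does not match, so it doubles as a sanity check.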