logical indexing with a smaller array should throw a warning

I am a heavy user of logical indexing. I think the default behaviour of allowing indexing with a differently sized logical array without warning is a very dangerous practice. Even worse, the documentation is actually wrong: you can over-index an array as long as the extra elements of the logical array are all false. For example,
x = ones(10,1);
l = [ true ; false(15,1)];
x(l);
runs without any error or warning. If you are unlucky enough to have an array with mostly false at the end, you will probably not detect related bugs for a long time.
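As a minimal sketch of where the boundary lies (the variable names l_ok, l_bad and caught are illustrative, not from the original post): an oversized mask is accepted as long as every element past the end of x is false, and an error is raised only when one of the extra elements is true.

```matlab
x = ones(10,1);

% Oversized mask whose extra elements are all false: accepted silently.
l_ok = [true; false(15,1)];
r = x(l_ok);                 % selects x(1) only

% Oversized mask with a true element past the end of x: MATLAB errors.
l_bad = [false(15,1); true];
caught = false;
try
    x(l_bad);
catch err
    caught = true;
    disp(err.message);       % exact wording depends on the MATLAB release
end
```

So the size mismatch only surfaces when a true value happens to fall outside the indexed array.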
Logical indexing is mostly used to restrict data to a subset. The logical index is often generated by complex calculations, and then the restricted and resized dataset is processed further in the same piece of code. Often, different subsets and datasets are used within the same code, with different sizes. A significant number of bugs can be avoided if the size of the logical is checked against the array.
I strongly believe that there should be an option to enable warnings about using an incorrectly sized logical array for indexing.
Does anybody have an idea how to deal with this issue, apart from defining a function like
function r = sa(x,l)
if any(size(x) ~= size(l))
error('Incorrect assignment.');
end
r = x(l);
and littering the nice x(l) references with ugly sa(x,l) everywhere?
  5 Comments
Daniel Shub on 11 Jan 2013
@Balint, I was saying I don't know why you would want to index an array of length N with a logical array of length M where M > N. I can see your desire for a warning or error.
Balint Takacs on 11 Jan 2013
Edited: Balint Takacs on 11 Jan 2013
@Sean: MATLAB does throw an error for out-of-range numeric indices, but not for mismatched logical masks.
Consider the following:
x = rand(10,1);
small = x < 0.1;
large = x > 0.9;
x_middle = x(~small & ~large);
x_small = x(small); % correct version
% coder forgets which data space 'small' is in,
% and introduces a semantic bug ...
x_small = x_middle(small);
% ... which is not detected until rand()
% draws a vector with lots of small values (never in practice).
Because logical indices are not bound to the data they are used on, the way iterators in other languages are, it is quite hard to keep track of which data space each one belongs to, especially when there are lots of them. These types of bugs are quite easy to introduce in practice.
Checking the size will not entirely solve this lack of semantic connection, but at least it gives an opportunity to detect the cases where such a bug is likely.
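To make the failure mode concrete, here is a small sketch that replaces rand() with fixed data (an assumption made purely for repeatability; the names x_small_wrong and size_matches are illustrative). The misplaced mask is accepted silently and returns a value that is not small at all, while a plain size comparison flags it.

```matlab
x = [0.05; 0.50; 0.95; 0.30];    % fixed data instead of rand()
small = x < 0.1;                 % [true; false; false; false]
large = x > 0.9;                 % [false; false; true; false]
x_middle = x(~small & ~large);   % [0.50; 0.30]

% 'small' belongs to x, not to x_middle, yet MATLAB accepts it because
% the mask elements past the end of x_middle are all false.
x_small_wrong = x_middle(small); % silently returns 0.50

% An explicit size comparison detects the mismatch.
size_matches = isequal(size(x_middle), size(small));
if ~size_matches
    disp('Mask size does not match the indexed array.');
end
```

As noted above, the check would not fire if x happened to contain no small or large values, because x_middle would then have the same size as x.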

Answers (4)

Jim Svensson on 6 Apr 2023
MATLAB should definitely require that the logical indexing mask is exactly the same size as the data being indexed. Not doing so is very bad.

Jan on 10 Jan 2013
Logical indexing is not even implemented efficiently. I'm going to publish a faster version on the File Exchange, but it handles the right-hand side of assignments only. I do not know how to replace the left-hand side of an assignment, e.g. in:
L = rand(1, 100) < 0.5;
X = rand(10, 10);
X(L) = X(L) - 1; % MEXing the RHS is easy, but the LHS?!
any(size(x) ~= size(l)) is a bad idea, because it fails when x and l have a different number of dimensions, and it also rejects a mask of a different shape, as in the example above. Mixing linear indexing and logical indexing is important and very useful.
I'm sure that the behavior of logical indexing will not be changed, in order to preserve backward compatibility.
  2 Comments
Balint Takacs on 10 Jan 2013
They should not change its behaviour, but they can add a warning which can be turned off. BTW I want the thing to fail when the number of dimensions does not match.
Jan on 10 Jan 2013
Edited: Jan on 10 Jan 2013
@Balint: I cannot believe that you want to get a strange error message about a bad usage of the eq operator. In case of a mismatch the test itself should not fail, but return TRUE:
if ~isequal(size(x), size(l))
error('Incorrect assignment in logical index operation.');
end
Did you measure the time required to ignore a warning? The overhead is surprisingly high, and if the warning were enabled, users might be confused by getting dozens of warnings from correctly working toolbox functions.
Therefore I suggest testing the dimensions explicitly instead of injecting this extra test into the standard functionality.
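The difference between the two tests can be sketched as follows (a minimal illustration with made-up array sizes; eq_errors and mismatch are hypothetical names). When the size vectors have different lengths, the element-wise comparison itself throws an error, whereas isequal simply returns false.

```matlab
x = ones(2,3,4);     % size(x) is [2 3 4]
l = true(2,3);       % size(l) is [2 3]

% Element-wise comparison of size vectors of different length errors out.
eq_errors = false;
try
    any(size(x) ~= size(l));
catch
    eq_errors = true;
end

% isequal handles the differing lengths and just reports a mismatch.
mismatch = ~isequal(size(x), size(l));
```

With isequal the caller can raise a meaningful error message of their own instead of one about incompatible operand sizes.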


Jonathan Sullivan on 10 Jan 2013
This is an interesting proposition. While I'm very much against throwing an error in this case, I would be open to having a warning issued. Not whenever the sizes differ, though, but only when the number of elements differs.
I have been known to use column vectors to index row vectors and vice versa, and I think that is OK. But I do want to be made aware when I'm using a 100-element logical array to index a vector that has 150 elements.
Something like:
if numel(x) ~= numel(l)
warning('Logical index array has a different number of elements than the array being indexed.');
end
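One way to make such a warning switchable is to give it an identifier. The sketch below assumes a made-up identifier, 'myutils:logicalIndexSize'; it is not an identifier that MATLAB itself defines.

```matlab
warnId = 'myutils:logicalIndexSize';   % hypothetical identifier
x = 1:150;
l = true(1,100);

lastwarn('');                          % clear the last-warning record
if numel(x) ~= numel(l)
    warning(warnId, ...
        'Logical index has %d elements, indexed array has %d.', ...
        numel(l), numel(x));
end
[~, lastId] = lastwarn;                % identifier of the warning just issued

% Code that deliberately uses short masks can silence this one warning
% and restore the previous state afterwards.
state = warning('off', warnId);
y = x(l);
warning(state);
```

Only the numel check is performed here, so indexing a row vector with a column mask of the same length stays silent.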

Matt J on 10 Jan 2013
I tend to agree with you about the dangers. If it's a feature, it's one I have never had use for in many years of using MATLAB. The only rationale for it that I can think of is that it can save you memory, if you know your trues are concentrated in the beginning of the index array, to discard the trailing falses.
One option is to define your own sub-class of double (or whatever) and write a subsref method that throws the warning. Below are the beginnings of such a sub-class, with an illustration of its use.
>> x=myclass(1:10);
>> l=[ true(3,1) ; false(15,1)];
>> x(l)
Warning: Logical mask of untypical size
> In myclass>myclass.subsref at 25
ans =
1 2 3
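classdef myclass<double
    methods
        function obj=myclass(data)
            obj@double(data);
        end
        function out = subsref(obj,S)
            % Only inspect parenthesis indexing such as x(mask) or x(rows,cols).
            if strcmp(S(1).type,'()')
                subs=S(1).subs;
                dims=size(obj);
                n=numel(subs);
                for ii=1:n
                    idx=subs{ii};
                    if ii<n
                        expected=size(obj,ii);
                    else
                        % The last subscript spans all remaining dimensions,
                        % so x(mask) is compared against numel(x).
                        expected=prod(dims(ii:end));
                    end
                    if islogical(idx) && numel(idx)~=expected
                        warning('Logical mask of untypical size');
                    end
                end
            end
            out = subsref@double(obj,S);
        end
        function display(obj)
            display(double(obj))
        end
    end
end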
  1 Comment
Walter Roberson on 10 Jan 2013
I, of course, have taken advantage of the feature from time to time ;-) Saves having to calculate the padding with false() that I would have to add to make the number of elements the same.
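The shortcut described here can be sketched like this (illustrative variable names): a mask shorter than the array behaves as if it were padded with false up to the full length.

```matlab
x = 1:10;
short = [true false true];                            % 3-element mask
padded = [short, false(1, numel(x) - numel(short))];  % explicit padding

a = x(short);     % elements 1 and 3
b = x(padded);    % same selection with the padded mask
same = isequal(a, b);
```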
