MATLAB Answers

How to convert a numeric string into a numeric range?

I am working with a GUI which allows users to select custom groups of numbers. The inputs are always stored as strings; however, I need to convert the string to a range of numbers.
For example, if the user inputs...
[1:3,5,7:9]
Then I would like to have a stored value of...
[1, 2, 3, 5, 7, 8, 9]
Is there a way to do this without using eval()?
eval('[1:3,5,7:9]')
I know the use of eval() is frowned upon, but I cannot think of a more efficient method. My only other idea has been to use regexp() which takes much more time because of all the conditional aspects of the search.
Note: I know this can be done in the Command Window, but I am attempting to only use GUI functions or other similar functions that create a pop-up, such as:
inputdlg()


Accepted Answer

Stephen Cobeldick on 20 Sep 2017
Edited: Stephen Cobeldick on 8 Mar 2020
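function out = str2vec(str)
% Parse a bracketed list such as '[1:3,5,7:9]' into a numeric row vector.
vec = sscanf(str(2:end),'%f%c'); % skip '[', then read alternating numbers and delimiter codes
out = [];
idb = 1; % index of the number that begins the current item
ide = 1; % index of the number that ends the current item
while idb<=numel(vec)
ide = idb+2*(vec(idb+1)==58); % 58==':', so a colon moves the end index to the next number
out = [out,vec(idb):vec(ide)]; % a single value when ide==idb, otherwise a range
idb = ide+2; % step over the delimiter to the next item
end
end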
It allows any decimal or integer numbers (including optional +/- sign and E-notation), separated by either one colon or one comma. For each number leading space characters are ignored, whereas trailing spaces cause an incorrect output. It could be adapted to allow for the optional step of the colon command.
Outputs using your example data:
28 29 30 31 32 33 5 7 8 9 20 % this function
28 29 30 31 32 33 5 7 8 9 20 % eval
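The remark above about adapting the function to the optional step of the colon operator could be sketched roughly as follows. This is only an illustration and not the answer author's code: the name str2vecStep is hypothetical, and the sketch assumes the same input form as str2vec (an opening and a closing bracket, one comma or colon between numbers, no trailing spaces).

```matlab
function out = str2vecStep(str)
% Hypothetical variant of str2vec that also accepts start:step:stop.
% Assumes input such as '[1:3,5,1:2:7]' with no trailing spaces.
vec  = sscanf(str(2:end), '%f%c');  % alternating numbers and delimiter codes
nums = vec(1:2:end);                % the numeric values
seps = vec(2:2:end);                % 58 is ':', 44 is ',', 93 is ']'
out = [];
k = 1;
while k <= numel(nums)
    n = 0;                          % number of colons chained after nums(k)
    while seps(k+n) == 58
        n = n + 1;
    end
    switch n
        case 0                      % single value
            out = [out, nums(k)];
        case 1                      % start:stop
            out = [out, nums(k):nums(k+1)];
        case 2                      % start:step:stop
            out = [out, nums(k):nums(k+1):nums(k+2)];
        otherwise
            error('str2vecStep:syntax', 'Too many colons in one range.');
    end
    k = k + n + 1;                  % move to the first number of the next item
end
end
```

With this sketch, str2vecStep('[1:2:7,10]') returns 1 3 5 7 10, and inputs without a step behave like the original function.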


OCDER on 20 Sep 2017
Hi Stephen, on my computer, your function is actually faster than eval (or just as fast)!
Str = '[33,37,-1:-4,1:30]'
tic
for j = 1:10000
Range = str2vec(Str);
end
toc %Elapsed time is 0.138589 seconds.
tic
for j = 1:10000
Range = eval(Str);
end
toc %Elapsed time is 0.144505 seconds.
Walter Roberson on 20 Sep 2017
It is better to time with timeit() than with tic/toc.
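A minimal sketch of what timing with timeit() might look like for this comparison. The variable names are illustrative, and the second handle times only the sscanf parsing step from the accepted answer rather than the whole str2vec function, so the two numbers are not a like-for-like comparison.

```matlab
Str = '[33,37,-1:-4,1:30]';

% timeit calls each handle repeatedly and returns a typical time in seconds,
% which avoids hand-written tic/toc loops.
tEval = timeit(@() eval(Str));
tScan = timeit(@() sscanf(Str(2:end), '%f%c'));

fprintf('eval:   %.3g s per call\n', tEval);
fprintf('sscanf: %.3g s per call\n', tScan);
```

To time a complete function, wrap the call in the handle in the same way, for example @() str2vec(Str) if that function is on the path.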
Samuel Clary on 22 Sep 2017
This function runs faster on my computer than eval() as well. I have been testing it with several different inputs and it has been working very well. On larger ranges it can slow down, but I will throw in some warnings for the user if they try.
I am very interested in your function though. I have not had an opportunity to really look through it. (I have never used the vec() function.) Although, I am very interested in figuring out why it works so quickly.


More Answers (3)

OCDER on 19 Sep 2017
Edited: OCDER on 19 Sep 2017
In case the user inputs out-of-order range, duplicate numbers, or negative numbers, this solution works too and is ~3x faster. But you may need more error handling features - can't predict all the types of inputs.
Str = '1:3,-9:-4,7:9'; %User inputs a weird range. No brackets needed
StrParts = cellfun(@(x) regexp(x, ':', 'split'), regexp(Str, '\-*\d+:\-*\d+|\-*\d+', 'match'), 'UniformOutput', false);
NumParts = cellfun(@(x) str2double(x(1)):str2double(x(end)), StrParts, 'UniformOutput', false);
Range = unique(cat(2, NumParts{:}));
Range =
-9 -8 -7 -6 -5 -4 1 2 3 7 8 9

  3 Comments

Samuel Clary on 20 Sep 2017
My problem with splitting the string using regexp() is that it is actually longer to get the result this way. Now, to be fair, we are talking ~1.5 msec for regexp(), but this is still roughly 26x slower than eval() at ~0.06 msec. My goal is to truly optimize with regards to speed. I really do appreciate your suggestion, though! That is the first time I have understood cellfun(). The examples I have seen up until this point never really clicked.
Walter Roberson on 20 Sep 2017
It is unlikely that you would be able to improve on eval() speeds, as eval() runs at compiled speeds whereas anything you do at the MATLAB level is at interpreted speeds.
To get something more robust but at compiled speeds you would need to move into a mex routine.
OCDER on 20 Sep 2017
Yeah, I can't find something faster than eval. I do have a faster solution that is only 2.6 times slower than eval based on 10000 iterations. See below. Otherwise, Walter's solution to use MEX or Jan's solution to use an eval with safety check would be faster.
Str = '[1:3,-9:-4,7:9]';
%Newer answer
tic
for k = 1:10000
StrParts = regexp(Str, '\-*\d+\:*', 'match');
j = 1;
while j <= length(StrParts)
if StrParts{j}(end) == ':'
StrParts{j} = str2double(StrParts{j}(1:end-1)):str2double(StrParts{j+1});
StrParts{j+1} = [];
j = j + 2;
else
StrParts{j} = str2double(StrParts{j});
j = j + 1;
end
end
Range = unique(cat(2, StrParts{:}));
end
toc %Elapsed time is 0.926773 seconds.
%Previous answer
tic
for k = 1:10000
StrParts = cellfun(@(x) regexp(x, ':', 'split'), regexp(Str, '\-*\d+:\-*\d+|\-*\d+', 'match'), 'UniformOutput', false);
NumParts = cellfun(@(x) str2double(x(1)):str2double(x(end)), StrParts, 'UniformOutput', false);
Range = unique(cat(2, NumParts{:}));
end
toc %Elapsed time is 3.294332 seconds.
%Eval answer
tic
for k = 1:10000
Range = unique(eval(Str));
end
toc %Elapsed time is 0.352939 seconds.



Walter Roberson on 19 Sep 2017
rng = @(a,b) strjoin(cellstr(num2str((str2double(a):str2double(b)).')),',');
S = '[1:3,5,7:13]'
result = str2double( regexp( regexprep(S, {'\[', ']', '(\d+):(\d+)'}, {'', '', '${rng($1,$2)}'}), '\s*,\s*', 'split') );
Note: this code assumes that entries are separated by comma (which might have spaces around them) not by spaces alone.

  1 Comment

Samuel Clary on 20 Sep 2017
I ran this code when I tested Donald Lee's (commenter above) code. Unfortunately, this code is slower than eval() as well, and my goal is to optimize for speed. I attempted to do something similar to this on my own, but I did not use regexprep(). I will keep that in mind for my future code. Thank you very much for the input!



Jan on 19 Sep 2017
As long as eval processes numbers and the colon only, and does not create a variable dynamically, it is not evil. You could think of a security check:
Str = '[1:3,5,7:9]';
if ~all(ismember(Str, '0123456789+-.,:[]')) % brackets are allowed because the example input contains them
error('Cannot process string securely');
end
v = eval(Str);
But as soon as expressions like "1e6" are considered, the problems begin: A user could type "eeee" and define a corresponding function. Then you need some regular expressions to examine the string to recognize valid numbers in scientific notation. But if this is implemented, using the output of regexp will be easier than eval-ing.
See the other two answers for constructive suggestions.
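As a rough illustration of the regular-expression check described above, the following sketch accepts only bracketed, comma-separated numbers and colon ranges, including E-notation. The pattern and the name isValid are assumptions made for this example; spaces are deliberately rejected, and the pattern has not been tested against every input a user might type.

```matlab
% Hypothetical whitelist pattern: one number, optionally in E-notation.
num  = '[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?';
item = [num '(:' num '){0,2}'];         % a number, a:b, or a:s:b
pat  = ['^\[' item '(,' item ')*\]$'];  % comma-separated items inside brackets

% True only when the whole string matches the pattern.
isValid = @(s) ~isempty(regexp(s, pat, 'once'));

isValid('[1:3,5,7:9]')      % true
isValid('[1e6,2.5e-3]')     % true
isValid('[eeee]')           % false, no digits
isValid('[1:3];rmdir(1)')   % false, text after the closing bracket
```

Because the pattern is anchored at both ends, anything appended before or after the bracketed list causes the check to fail.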

  5 Comments

Guillaume on 20 Sep 2017
In the end, it comes down to: can you trust the user? If the user can be trusted not to enter anything nefarious (such as rmdir('c:\', 's')) then using eval would be acceptable. If not, then you need to validate the string which likely involves regexp at which point there's no point in the eval anymore.
Note that if the program is useful, the potential user may change from trusted to untrusted as the program gets more widely used. By which time, it will have been forgotten that the parsing method did not validate its input and all hell may break loose. Therefore, coding defensively to start with would be safer.
Jan on 20 Sep 2017
@Guillaume: You can avoid the need to trust the user by setting Matlab to a defined state before calling eval:
Str = input('Input what ever you want:', 's')
system('format C:')
exit;
eval(Str); % ;-)
You can never trust a user, see Cody: Cheating is challenging for many users. Shadowing the functions that determine whether the result is correct was only the beginning. What about activating sendmail of the underlying Linux sandbox? Or perhaps you can copy a dump of the complete virtual machine including the license to your dropbox?
I could never understand why MathWorks offers such a powerful "eval it for me" service without a certified user identification. It is an invitation for illegal activities.
I've worked in a lab in which the computers were completely locked down by the IT staff: Windows without task manager and without user access to any control panel, command window or PowerShell. After an IT admin fixed a computer, he left the volume of the internal speakers at 100%, such that each beep blew my brain away. It would have taken 2 days until he had time for us again. Therefore I started a compiled Matlab application and used an eval'ed edit field to start:
system('rundll32.exe shell32.dll,Control_RunDLL mmsys.cpl,,0 &');
to access the sound control panel. Disabling a warning sound is not an illegal activity and of course I've informed the admins about what I was doing.
The sscanf library of old Matlab versions contained a bug which allowed an attacker to gain admin privileges. The Java engine shipped with Matlab is also susceptible and (except for Macs) it is not updated. See also https://www.mathworks.com/matlabcentral/answers/58642-security-implications-by-java.
This means:
  1. The Matlab prompt can be misused to access proprietary data, to send spam or start DDoS attacks.
  2. Compiled GUIs which eval user input are equivalent to a Matlab prompt.
"No, my software will not be used for evil things" is the typical misjudgment that allowed millions of IoT light bulbs and internet routers to be converted into an attacking bot network.
My advice considering security implications: Never use eval for user input.
Samuel Clary on 20 Sep 2017
I understand why the use of eval() is typically frowned upon now. I never realized just how powerful of a function this was. I will be sure to place restrictions on eval() if I use it in the future. Thanks for this advice and clarification.
