MATLAB Answers

Reading within parentheses in MATLAB

Asked by NP4432
on 4 Apr 2017
Latest activity Commented on by Jan
on 5 Apr 2017
I have a 100 x 1 cell array where each element consists of a number followed by a space followed by a number in parentheses. For example: 401 (106), 234 (113), 345 (120)...
I would like to create a new 100 x 1 cell array where each element consists solely of the number within the parentheses. That is: 106, 113, 120...
Is there a way to do this in MATLAB? I was thinking about scanning each element and reading only the data after "(" and before ")", but I don't know how to go about doing it.


3 Answers

Answer by Stephen Cobeldick on 4 Apr 2017
Edited by Stephen Cobeldick on 4 Apr 2017

This is very easy with regexp:
>> C = {'401 (106)', '234 (113)', '345 (120)'};
>> D = regexp(C,'(\d+)\s+\((\d+)\)','tokens','once');
>> str2double(cell2mat(D))
ans =
401 234 345
106 113 120

  1 Comment

Guillaume
on 4 Apr 2017
Or you could use this slightly less strict regex:
str2double(regexp(C, '(?<=\()[^)]*', 'match', 'once'))
This matches everything after ( up to ).
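For reference, a minimal sketch of how the lookbehind version above could be used to build the requested cell array. The variable names S, N and R are only illustrative, and it assumes every element contains exactly one number in parentheses.

```matlab
C = {'401 (106)'; '234 (113)'; '345 (120)'};    % sample data as a column cell array

% (?<=\() is a lookbehind: the match starts right after a literal '('
% [^)]*   then takes every character that is not ')'
S = regexp(C, '(?<=\()[^)]*', 'match', 'once'); % cell array of char vectors, e.g. '106'
N = str2double(S);                              % numeric column vector
R = num2cell(N);                                % cell array with one number per cell
```

Because the pattern accepts anything between the parentheses, non-numeric content would come back from str2double as NaN rather than raising an error.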


Answer by Jan
on 4 Apr 2017
Edited by Jan
on 4 Apr 2017

[EDITED] The former version returned both values, but the OP NP4432 asked for the value in parentheses only.
C = {'401 (106)', '234 (113)', '345 (120)'};
D = sscanf(sprintf('%s ', C{:}), '%*g (%g)', [1, numel(C)]).';
R = num2cell(D, 2);
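A commented sketch of the same approach, split into steps, may help to see what the format string is doing. The intermediate variable S is only illustrative and is not part of the answer above.

```matlab
C = {'401 (106)', '234 (113)', '345 (120)'};

S = sprintf('%s ', C{:});   % join all elements: '401 (106) 234 (113) 345 (120) '

% '%*g'  reads a number and discards it (the * suppresses the output)
% ' ('   skips whitespace, then matches the literal '('
% '%g)'  reads the number to keep, then matches the literal ')'
% SSCANF reapplies the format until the input is exhausted.
D = sscanf(S, '%*g (%g)', [1, numel(C)]).';   % column vector [106; 113; 120]
R = num2cell(D, 2);                           % one number per cell
```

Joining everything into one string first means SSCANF is called once instead of once per element.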
As usual, a run-time comparison:
C = cell(1, 100); % Test data
for k = 1:numel(C), C{k} = '401 (106)'; end
tic
for k = 1:100
    D = sscanf(sprintf('%s ', C{:}), '%*g (%g)', [1, numel(C)]).';
    R1 = num2cell(D, 2);
end
toc
tic
for k = 1:100
    % D = regexp(C,'(\d+)\s+\((\d+)\)','tokens','once');
    % R2 = str2double(cell2mat(D)); % Does not run on my old 2009a
    % R2 = str2double(cat(1, D{:}));
    D = str2double(regexp(C, '(?<=\()[^)]*', 'match', 'once'));
    R2 = num2cell(D, 1);
end
toc
tic
for k = 1:100
    R3 = cellfun(@(x) sscanf(x, '%*g (%g)'), C, ...
        'Uniformoutput', false);
end
toc
R2009a/64, Win7, 2 cores of an i5 in a VM:
Elapsed time is 0.187068 seconds. % SSCANF & SPRINTF
Elapsed time is 1.337538 seconds. % REGEXP
Elapsed time is 1.207764 seconds. % CELLFUN(SSCANF)

  4 Comments

Jan
on 4 Apr 2017
@Stephen: The OP wants the number inside the parentheses only. I had overlooked this at first.
If speed matters, use CStr2String from the File Exchange (FEX):
num2cell(sscanf(CStr2String(C), '%*d (%d)'), 2)
It seems like [C{:}] has problems with the pre-allocation.
"The OP wants the number inside the parenthesis only"
Sure, and this is trivial with indexing:
>> C = {'401 (106)', '234 (113)', '345 (120)'};
>> V = sscanf([C{:}],'%d (%d)');
>> V(2:2:end)
Jan
on 5 Apr 2017
@Stephen: I have the old R2009a only during the day. I assume REGEXP of modern Matlab versions is faster. Could you post the output of the speed comparison? Thanks.


Answer by Joseph Cheng
on 4 Apr 2017
Edited by Joseph Cheng
on 4 Apr 2017

Since you've got the right idea, and cellfun() is troublesome and sometimes confusing, I've coded it for you here:
dumdata = {'401 (106)',' 234 (113)',' 345 (120)'};
cellfun(@(x) str2double(x(find(x=='(')+1:find(x==')')-1)),dumdata,'Uniformoutput',false)
cellfun applies the function to each cell of the supplied cell array. Otherwise this can be done in a for loop by implementing what is shown in the anonymous function.
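As a rough sketch, the for loop equivalent of the anonymous function above might look like this. The names out, iOpen and iClose are made up for illustration, and it assumes each element contains one "(" followed later by one ")".

```matlab
dumdata = {'401 (106)', ' 234 (113)', ' 345 (120)'};
out = cell(size(dumdata));                 % preallocate the result
for k = 1:numel(dumdata)
    x = dumdata{k};
    iOpen  = find(x == '(', 1, 'first');   % position of the opening parenthesis
    iClose = find(x == ')', 1, 'first');   % position of the closing parenthesis
    out{k} = str2double(x(iOpen+1:iClose-1));   % convert the text in between
end
```

Restricting find to the first hit keeps the index range scalar even if an element happens to contain more than one parenthesis.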

  2 Comments

Jan
on 4 Apr 2017
@Joseph: I get the error:
??? Index exceeds matrix dimensions.
Error in ==> @(x)str2double(x(find(x=='(')+1:find(x==')')+1))
Oops, that should be -1. The clipboard didn't update, I guess, when I pasted in my script.
