Comparing two cell arrays of strings of different sizes

Dear all,
I'm trying to compare two cell arrays of strings, a lot like the following:
A = {'X_ABCDE';'X_BCDEA';'X_BCED'}
B = {'X_A';'X_B'}
and the resulting array should be:
index = [1;2;2]
because I have a cell array D = [B, C], where C contains the cells I want to end up with.
I wanted to do it using a for-loop and a nested for-loop, but, as A is 8000x1 and D is 2000x2, this will take forever. I tried using strcmp and ismember, but these only work when the cells (or strings) are identical. strfind doesn't work either, because when I use strfind I need to have a for-loop over array A, and strfind only works if the smallest of both is a string.
for iCell = 1:length(A)
    index = strfind(A(iCell), cell2mat(D(iCell,1)));
end
A string in B is ALWAYS shorter than in A and a string in B can occur more than once in A. I hope you can help me out :)
  2 Comments
Stephen23 on 27 Oct 2015
Edited: Stephen23 on 27 Oct 2015
You state that "I have a matrix D = [B, C] with C containing the cells I want to end up with", but if the matrix D is an output (i.e. it is defined by what you "end up with"), then how can D be accessed within the loop as D(iCell,1)? How does D get defined if C is an output?
Jasper Admiraal on 27 Oct 2015
Sorry, I got it wrong. I want to end up with a certain order of the strings from C, indicated by the array index. In my code, a cell in B contains a code which corresponds to a specification in C (in the same row).

Accepted Answer

Stephen23 on 27 Oct 2015
Edited: Stephen23 on 27 Oct 2015
Use strncmp (not strcmp) to make the code simpler. Loop over the smaller cell array B, and pass the larger cell array A whole to strncmp:
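A = {'X_ABCDE';'X_BCDEA';'X_BCED'}
B = {'X_A';'X_B'}
%
X = zeros(size(A)); % 0 marks strings in A that match nothing in B
for k = 1:numel(B)
    % compare the first 3 characters of every string in A with B{k}
    X(strncmp(A,B{k},3)) = k;
end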
produces:
>> X
X =
1
2
2
Do not use arrayfun or cellfun if you want fast code: they will be slower than a for-loop.
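The fixed length 3 in the loop above fits this example because every string in B has exactly three characters. Here is a minimal sketch of a variant that takes the comparison length from each string in B, then uses the index to pick the matching cells of C out of D = [B, C], as described in the question. The contents of C are made-up placeholders. The sketch also assumes that no string in B is a prefix of another string in B.

```matlab
A = {'X_ABCDE';'X_BCDEA';'X_BCED'};
B = {'X_A';'X_B'};
C = {'spec one';'spec two'};   % hypothetical specifications, one per row of B
D = [B, C];                    % 2x2 cell array, laid out as in the question

X = zeros(size(A));            % 0 means no string in B matched
for k = 1:size(D,1)
    prefix = D{k,1};
    % compare only the first numel(prefix) characters of each string in A
    X(strncmp(A, prefix, numel(prefix))) = k;
end

matched = X > 0;               % skip unmatched strings before indexing into D
result = cell(size(A));
result(matched) = D(X(matched), 2);
```

Strings in A that match nothing in B keep X = 0 and an empty cell in result, so they can be found afterwards with ~matched.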

More Answers (1)

Jos (10584) on 27 Oct 2015
Assuming all strings in B are unique:
A = {'X_ABCDE' ; 'X_BCDEA' ; 'X_BCE'}
B = {'X_A' ; 'X_B'}
index = arrayfun(@(k) find(strncmp(A{k},B,3)), 1:numel(A))
  2 Comments
Stephen23 on 27 Oct 2015
Edited: Stephen23 on 27 Oct 2015
My solution is much faster than this one (timed over 100 iterations):
Elapsed time is 0.0370002 seconds. % my answer
Elapsed time is 1.623 seconds. % this answer
Using arrayfun is slower than using a loop.
Jos (10584) on 28 Oct 2015
Of course it is. I just wanted to show a different answer ... :-)
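For anyone who wants to repeat the comparison, here is a rough timing sketch using tic/toc. The data is synthetic and purely an assumption: 2000 unique six-character codes in B and 8000 longer strings in A, roughly the sizes mentioned in the question. Timings will differ between machines and MATLAB releases, so none are quoted here.

```matlab
% Synthetic data (an assumption, not the asker's real data)
nB = 2000; nA = 8000;
B = arrayfun(@(k) sprintf('X_%04d', k), (1:nB).', 'UniformOutput', false);
pick = randi(nB, nA, 1);          % which code each string in A starts with
A = strcat(B(pick), '_tail');     % append a suffix so A is longer than B

% Loop over the smaller array B, passing A whole to strncmp
tic
X1 = zeros(size(A));
for k = 1:numel(B)
    X1(strncmp(A, B{k}, 6)) = k;
end
toc

% arrayfun over A, comparing one string of A with all of B at a time
tic
X2 = arrayfun(@(k) find(strncmp(A{k}, B, 6)), (1:numel(A)).');
toc

% Both approaches should recover the index the data was built from
sameResult = isequal(X1, X2, pick)
```

The final check only confirms that the two approaches agree on this synthetic data. It says nothing about which one is faster.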
