
## CStrAinBP

version 1.1 (37.3 KB) by Jan Simon

Overlapping elements of 2 cell strings. 10-20 times faster than INTERSECT/ISMEMBER/SETDIFF.


Find overlap of 2 cell strings.
This can be used for a faster calculation of:
INTERSECT, ISMEMBER, SETDIFF and UNION.
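
As a rough illustration of the ISMEMBER case, the two index vectors described under OUTPUT on this page can be expanded into a logical mask and a position vector. This is only a sketch: the values of AI and BI are copied from the first example on this page instead of calling the MEX function, and the names `tf` and `loc` are chosen here for illustration.

```matlab
% Index vectors copied from the first example on this page
% (A = {'a','b','q','a'}, B = {'a','c','d','a','b'}).
A  = {'a', 'b', 'q', 'a'};
AI = [1, 2, 4];
BI = [1, 5, 1];

% Logical mask: true where an element of A also occurs in B.
tf = false(size(A));
tf(AI) = true;

% Position in B of the first match, 0 where there is no match.
loc = zeros(size(A));
loc(AI) = BI;

disp(tf);
disp(loc);
```

With `tf` at hand, `A(~tf)` lists the elements of A that do not occur in B, which is the SETDIFF case before any sorting or removal of duplicates.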

Comparison to Matlab's INTERSECT:
- Consider repeated strings (no internal UNIQUE)
- No sorting
- Can be insensitive for upper/lower case
- M-version as demonstration of the method
- MEX-version: 90% to 98% faster than INTERSECT

[AI, BI] = CStrAinBP(A, B, CaseSensitive)
INPUT:
A, B: Cell strings.
CaseSensitive: Optional string controlling case sensitivity. In the second example below, 'i' makes the comparison ignore upper/lower case.
OUTPUT:
AI: Indices of common strings in A.
Each occurrence of repeated strings is considered.
AI is sorted from low to high indices.
BI: Indices of common strings in B, such that A{AI} == B{BI}.
If B is not unique, the first occurrence of a string is used.

EXAMPLES:
[AI, BI] = CStrAinBP({'a', 'b', 'q', 'a'}, {'a', 'c', 'd', 'a', 'b'})
returns: AI = [1, 2, 4] and BI = [1, 5, 1]

[AI, BI] = CStrAinBP({'a', 'b', 'A'}, {'a', 'c', 'a', 'B', 'b'}, 'i')
returns: AI = [1, 2, 3] and BI = [1, 4, 1]
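
The documented behaviour can be modelled with a plain loop. The following is a minimal reference sketch written for this page, not the included CStrAinBP.m and not the method used by the MEX file; the variable `ignoreCase` is a hypothetical stand-in for the optional third input. It is slow, but it is handy for checking results by hand.

```matlab
% Inputs of the first example above.
A = {'a', 'b', 'q', 'a'};
B = {'a', 'c', 'd', 'a', 'b'};
ignoreCase = false;   % hypothetical flag; set to true to mimic the 'i' option

AI = zeros(1, 0);
BI = zeros(1, 0);
for k = 1:numel(A)
    if ignoreCase
        hit = find(strcmpi(A{k}, B), 1);   % first match in B, ignoring case
    else
        hit = find(strcmp(A{k}, B), 1);    % first exact match in B
    end
    if ~isempty(hit)
        AI(end + 1) = k;     % every occurrence in A is kept, in ascending order
        BI(end + 1) = hit;   % repeated strings in B map to their first index
    end
end

disp(AI);
disp(BI);
```

For these inputs the loop reproduces the index vectors of the first example. Setting `ignoreCase = true` and using the inputs of the second example gives the second pair of vectors.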

INCLUDED FILES:
CStrAinBP.m: Proof of concept, demonstration.
CStrAinBP.C: Fast MEX function.
CStrAinBP.MEXW32: Compiled for Matlab 7 with LCC3.8.
Matlab6/CStrAinBP.DLL: Compiled for Matlab 6 with BCC5.5.
For Matlab 6, replace the MEXW32 file by this DLL.
TestCStrAinBP: Run the test after installation or compiling.

Tested: Matlab 6.5, 7.7, 7.8, Win2K/XP

### Jan Simon

@Hoi Wong:
Intersect:
AB = A(CStrAinBP(A,B));
For SETDIFF, UNION and SETXOR, an equivalent tool, CStrisAinB, is more useful because it returns a logical vector. I'm going to publish it soon.
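
A possible way to get closer to the native set functions is to apply UNIQUE to the smaller result instead of to both inputs. The sketch below is an assumption-laden illustration, not Jan Simon's code: it calls CStrAinBP only if it is found on the path and otherwise falls back to the slower built-in ISMEMBER, and the names `common` and `onlyA` are made up here.

```matlab
A = {'b', 'a', 'q', 'a'};
B = {'a', 'c', 'd', 'a', 'b'};

if exist('CStrAinBP', 'file')
    AI = CStrAinBP(A, B);          % fast path, needs the submission installed
else
    AI = find(ismember(A, B));     % slower built-in stand-in for the demo
end

% INTERSECT-like result: sorted, duplicates removed.
common = unique(A(AI));

% SETDIFF-like result: elements of A without a match in B.
inB = false(size(A));
inB(AI) = true;
onlyA = unique(A(~inB));

disp(common);
disp(onlyA);
```

UNIQUE now runs only on the matched (or unmatched) elements, so its cost depends on the size of the result and not on the size of A and B. Whether this keeps most of the speed advantage depends on the data and is not measured here.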

### Hoi Wong

Great tool. Can you suggest how to implement the set operations (intersect/setdiff/union/setxor) using CStrAinBP so that they behave like the native ones, which call unique(), without a major hit in performance?

Thanks.

### Rodrigo Fernandes

hello Jan,

Nice contribution. I have just one question: what if I want to compare a = {{'a'},{'b'},{'c'}} and b = {{'a'},{'b'},{'c'},{'d'}}?

best wishes

Rodrigo

### Monchai Trakulpoochai

It suits my project well! Great file!

Excellent contribution, saving me a lot of time!

### Michael

Works well, thanks!