Code covered by the BSD License

Rating: 5.0 (4 ratings) | Downloads (last 30 days): 11 | File Size: 37.3 KB | File ID: #24380 | Version: 1.1

CStrAinBP

Jan Simon

09 Jun 2009 (Updated )

Overlapping elements of 2 cell strings. 10-20 times faster than INTERSECT/ISMEMBER/SETDIFF.

File Information
Description

Find overlap of 2 cell strings.
This can be used for a faster calculation of:
INTERSECT, ISMEMBER, SETDIFF and UNION.

Comparison to Matlab's INTERSECT:
- Consider repeated strings (no internal UNIQUE)
- No sorting
- Can be insensitive for upper/lower case
- M-version as demonstration of the method
- MEX-version: 90% to 98% faster than INTERSECT

[AI, BI] = CStrAinBP(A, B, CaseSensitive)
INPUT:
A, B: Cell strings.
CaseSensitive: Optional string to trigger sensitivity for case.
OUTPUT:
AI: Indices of common strings in A.
Each occurrence of repeated strings is considered.
AI is sorted from low to high indices.
BI: Indices of common strings in B.
If B is not unique, the first occurrence of a string is used,
such that A{AI} == B{BI}.
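
The input/output contract above can be restated as a short plain-MATLAB loop. This is only a sketch for illustration: the function name CStrAinBP_sketch is hypothetical, it is not the code shipped in CStrAinBP.m or CStrAinBP.C, and treating a leading 'i' as the switch for case-insensitive matching is an assumption taken from the second example below.

```matlab
function [AI, BI] = CStrAinBP_sketch(A, B, CaseSensitive)
% CStrAinBP_sketch  Illustrative restatement of the documented behaviour.
% Hypothetical helper, NOT the shipped M- or MEX-implementation.

if nargin > 2 && strncmpi(CaseSensitive, 'i', 1)
   % Assumption: a leading 'i' requests case-insensitive matching.
   A = lower(A);
   B = lower(B);
end

AI = zeros(1, numel(A));
BI = zeros(1, numel(A));
n  = 0;
for iA = 1:numel(A)
   iB = find(strcmp(A{iA}, B));   % All positions of A{iA} in B
   if ~isempty(iB)
      n     = n + 1;
      AI(n) = iA;                 % Every occurrence in A is reported
      BI(n) = iB(1);              % First occurrence in B is used
   end
end

% Crop the pre-allocated outputs to the number of matches:
AI = AI(1:n);
BI = BI(1:n);
```

The loop visits A in order, so AI is sorted from low to high indices without an explicit sort. It costs one STRCMP pass over B per element of A, so it shows the semantics only and says nothing about how the MEX function reaches its speed.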

EXAMPLES:
[AI, BI] = CStrAinBP({'a', 'b', 'q', 'a'}, {'a', 'c', 'd', 'a', 'b'})
replies: AI = [1, 2, 4] and: BI = [1, 5, 1]

[AI, BI] = CStrAinBP({'a', 'b', 'A'}, {'a', 'c', 'a', 'B', 'b'}, 'i')
replies: AI = [1, 2, 3] and: BI = [1, 4, 1]
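
The index vectors of the first example can be turned into results resembling ISMEMBER, INTERSECT, SETDIFF and UNION, as sketched below. AI and BI are hard-coded to the documented reply so that the snippet runs without the MEX file, and the variable names are illustrative only. Unlike the native set functions, these results keep repeated strings and are not sorted unless UNIQUE is applied explicitly.

```matlab
A = {'a', 'b', 'q', 'a'};
B = {'a', 'c', 'd', 'a', 'b'};

% Documented reply of [AI, BI] = CStrAinBP(A, B) for these inputs,
% hard-coded here so that the snippet runs without the MEX file:
AI = [1, 2, 4];
BI = [1, 5, 1];

same    = isequal(A(AI), B(BI));   % Matched strings agree pairwise

inB     = false(size(A));
inB(AI) = true;                    % Like ISMEMBER(A, B)

common  = A(AI);                   % Like INTERSECT: repeats kept, order of A
onlyA   = A(~inB);                 % Like SETDIFF(A, B): repeats kept, unsorted
merged  = [B, A(~inB)];            % Like UNION: repeats kept, unsorted

commonU = unique(A(AI));           % Sorted and without repeats, as INTERSECT
```

The call to UNIQUE in the last line adds sorting cost again, so it is only worth it when the sorted, repeat-free output of the native functions is really needed.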

INCLUDED FILES:
CStrAinBP.m: Proof of concept, demonstration.
CStrAinBP.C: Fast MEX function.
CStrAinBP.MEXW32: Compiled for Matlab 7 with LCC3.8.
Matlab6/CStrAinBP.DLL: Compiled for Matlab 6 with BCC5.5.
For Matlab 6, replace the MEXW32 file by this DLL.
TestCStrAinBP: Run the test after installation or compiling.

Tested: Matlab 6.5, 7.7, 7.8, Win2K/XP

MATLAB release: MATLAB 7.8 (R2009a)
Other requirements: Works under Matlab 6.5.1 also.
30 Dec 2014 Jan Simon

@Hoi Wong:
Intersect:
AB = A(CStrAinBP(A,B));
For SETDIFF, UNION and SETXOR an equivalent tool CStrisAinB is more useful, which replies a logical vector. I'm going to publish it soon.

25 Nov 2014 Hoi Wong

Great tool. Can you suggest how to implement setops (intersect/setdiff/union/setxor) using CStrAinBP that has the same behavior as the native setops that does unique() without any major hit in performance?

Thanks.

23 Aug 2012 Rodrigo Fernandes

hello Jan,

nice contribution. I have only one question: what if I want to compare a = {{'a'},{'b'},{'c'}}; b = {{'a'},{'b'},{'c'},{'d'}}?

best wishes

Rodrigo

11 Apr 2012 Monchai Trakulpoochai

It suits my project! Great file!

Excellent contribution, saving me a lot of time!

01 Jan 2010 Michael

Works well, thanks!