Rating: 4.0 (2 ratings) | 35 downloads (last 30 days) | File Size: 1.68 KB | File ID: #24559

Longest Common Substring

by David Cumin

 

26 Jun 2009 (Updated 29 Jun 2009)

Code covered by BSD License  

Gives the longest common substring between two strings.


File Information
Description

%%%INPUT
%%%X, Y - both are strings e.g. 'test' or 'stingtocompare'
%%%OUTPUT
%%%D is the length of the substring over the length of the shortest string
%%%dist is the length of the substring
%%%aLongestString is a string of length dist (only one of potentially many)
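
The help text above names three outputs but not how they are computed. The following is a minimal Python sketch of a function with the same outputs, not the submitted LCS.m. It assumes a standard dynamic-programming table with backtracking, assumes matches may have gaps between them (as in the 'fbce'/'abcde' example discussed in the comments), and assumes D is dist divided by the length of the shorter input. The name `lcs` and its return order are illustrative only.

```python
def lcs(x, y):
    """Return (d, dist, a_longest) for two strings or lists.

    Assumption (not confirmed by the source): d = dist / min(len(x), len(y)).
    Matches must appear in the same order in x and y but need not be adjacent.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        return 0.0, 0, x[:0]

    # c[i][j] holds the length of the longest match between x[:i] and y[:j].
    c = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])

    # Walk back from the bottom-right corner to recover one longest match.
    # The loop stops as soon as either index reaches 0.
    out = []
    i, j = n, m
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    out.reverse()

    dist = c[n][m]
    a_longest = "".join(out) if isinstance(x, str) else out
    return dist / min(n, m), dist, a_longest


d, dist, s = lcs("fbce", "abcde")
print(d, dist, s)  # 0.75 3 bce
```

Because ties are broken arbitrarily during backtracking, the returned string is only one of potentially many of length dist.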

MATLAB release: MATLAB 7.3 (R2006b)
Zip File Content: LCS.m, license.txt
Comments and Ratings (6)
26 Jun 2009 Bruno Luong

The code seems buggy, even for the example provided:

>> X='test', Y='stingtocompare'

X =

test

Y =

stingtocompare

>> [D, dist, aLongestString] = LCS(X,Y)
??? Attempted to access b(2,0); index must be a positive integer or logical.

Error in ==> LCS at 55
    if(b(i,j) == 3)
 
>>

29 Jun 2009 Matt Fig

Something is still wrong.

[D,G,S] = LCS('fbce','abcde');S
S =
bce

29 Jun 2009 Bruno Luong

I still do not understand what the function does:

X = [ 8 8 9 3 7 1 2 4 5 1 ]
Y = [ 7 6 6 10 2 7 9 4 1 1]

[D, dist, aLongestString] = LCS(X,Y)
D = 0.6000
dist = 4
aLongestString = [ 7 2 4 1]

% Furthermore, what is the limit of the function?

>> X=ceil(10*rand(1,1e4));
>> Y=ceil(10*rand(1,1e4));
>> [D, dist, aLongestString] = LCS(X,Y)
??? Out of memory. Type HELP MEMORY for your options.

Error in ==> LCS at 20
b = zeros(n+1,m+1);
  

29 Jun 2009 David Cumin

Bruno,
Thanks for your comments. LCS('fbce','abcde') gives the right answer. The code is designed to find the longest common substring of two given inputs. In this example, both 'fbce' and 'abcde' contain 'bce':
'fbce' -> '-bce'
'abcde' -> '-bc-e'
Hope that makes sense.
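
The alignment '-bc-e' keeps the matched characters in order but allows a gap between 'c' and 'e'. That behaviour is usually called a longest common subsequence; a match that must be contiguous gives a shorter answer for the same inputs. A small Python sketch of the difference follows (illustrative only, not the submitted code; both function names are hypothetical).

```python
def longest_common_substring(x, y):
    """Longest run of consecutive elements shared by x and y (first one found)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                # Extend the run that ended at x[i-2], y[j-2].
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return x[best_end - best_len:best_end]


def is_subsequence(s, t):
    """True if the elements of s appear in t in order, gaps allowed."""
    it = iter(t)
    return all(ch in it for ch in s)


print(longest_common_substring("fbce", "abcde"))  # bc
print(is_subsequence("bce", "abcde"))             # True
print("bce" in "abcde")                           # False
```

So 'bce' is common to both inputs only when gaps are allowed; with the contiguous definition the result for this pair is 'bc'.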

This technique is common in pattern matching. I'm not sure what the limit of the function is; I guess it depends on available memory.
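
To put a rough number on the memory limit: the reported failure is at `b = zeros(n+1,m+1)`, which allocates a full (n+1)-by-(m+1) table of 8-byte doubles. The Python sketch below estimates that size and shows a two-row variant that needs memory proportional to the shorter input. It is a sketch under the assumption that the function fills a standard dynamic-programming table; the names are hypothetical, and the two-row version returns only the length, not the matched string.

```python
def full_table_bytes(n, m, bytes_per_entry=8):
    """Bytes needed for one (n+1)-by-(m+1) table of 8-byte entries."""
    return (n + 1) * (m + 1) * bytes_per_entry


def lcs_length(x, y):
    """Length of the longest in-order match, keeping only two table rows."""
    if len(y) > len(x):
        x, y = y, x  # iterate over the longer input, store rows for the shorter
    prev = [0] * (len(y) + 1)
    for xi in x:
        cur = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


print(full_table_bytes(10**4, 10**4))  # 800160008, about 0.8 GB for one table
print(lcs_length("fbce", "abcde"))     # 3
```

Running time is still proportional to n*m in both variants; only the storage changes.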

Cheers,
David

30 Jun 2009 Bruno Luong

Thank you, I see now. I suggest updating the help to describe more clearly what the function does. It would also be good to add a few words about the memory requirement, complexity, and algorithm.

03 Jul 2009 Bruno Luong  
Updates
28 Jun 2009

Sorry - a simple operator change fixed that. Should work now. The answer to your 'test' 'stingtocompare' is [0.5 2 'st'].
Thanks for pointing out the error!

28 Jun 2009

Now will also work for integer inputs (not only strings).

29 Jun 2009

Included support for 0 similarity

Tag Activity for this File
Tag Applied By Date/Time
similarity string lcs data David Cumin 26 Jun 2009 11:07:29