Asked by googo on 16 Apr 2013

Hello,

I'm writing a function that returns every sequence of n consecutive letters in a string. For example:

    string = 'abcad', n = 3

    ans =

        abc
        bca
        cad

Well, my code is :

    function s=ngramsFreq(string,n)
    k=0;
    t = repmat(char(0),length(string)-n+1,n);
    for i=1:(length(string)-n)+1
        k=k+1;
        for j=1:n
            for m=k:(k+n)-1
                t(i,j)=string(m);
                break
            end
        end
    end
    s=t;
    end

When I run the function for 'abcad' with n=3, it returns:

    ngramsFreq('abcad',3)

    ans =

        aaa
        bbb
        ccc

I think the problem is in the inside loop:

    for m=k:(k+n)-1
        t(i,j)=string(m);
        break
    end

I want it to stop after one step rather than keep looping. For example, after t(1,1)=string(1), it should move on to j=2 without going back to m=1.

Any help with this? Thank you very much!

Note: for the time being, if a sequence occurs more than once, the function returns every occurrence (for example, 'aaa' with n=2 returns [aa;aa]).
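The inner loop in the question always exits on its first pass, so `m` equals `k` every time and each row is filled with a single repeated letter. A minimal sketch of an element-wise fix, assuming the intent is that column `j` of row `i` holds character `i+j-1`, is shown here. The names `str` and `numGrams` are chosen for illustration and are not from the original post (`str` is used instead of `string` to avoid shadowing the built-in of that name in newer releases).

```matlab
% Sketch only: build the n-gram table one character at a time.
str = 'abcad';                      % example input from the question
n = 3;
numGrams = length(str) - n + 1;     % number of rows in the result
t = repmat(char(0), numGrams, n);   % preallocate, as in the question

for i = 1:numGrams
    for j = 1:n
        % Row i starts at character i, so column j is character i+j-1.
        t(i, j) = str(i + j - 1);
    end
end

disp(t)
```

With this indexing the third loop over `m` and the `break` are no longer needed.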


Answer by Andrei Bobrov on 16 Apr 2013

Edited by Andrei Bobrov on 16 Apr 2013

Accepted answer

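    str = 'abcad';
    n = 3;
    % each column of the Hankel index matrix is one n-gram;
    % transpose so that each row holds one n-gram for any string length
    out = str( hankel(1:n,n:numel(str)) ).';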

without `hankel`

out = str( bsxfun(@plus,1:n,(0:numel(str) - n)') )

with `while`-loop

    g = 1:n;
    out = [];
    while g(end) <= numel(str)
        out = [out;str(g)];
        g = g + 1;
    end
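All of these variants rely on the same idea: build a matrix of indices in which row `i` is `i:i+n-1`, then index the string with it. As a rough sketch, on MATLAB R2016b or later the same index matrix can be written with implicit expansion instead of `bsxfun`. The variable names `m` and `idx` are illustrative and not part of the answer above.

```matlab
% Sketch only: index-matrix approach using implicit expansion (R2016b+).
str = 'abcad';
n = 3;
m = numel(str) - n + 1;        % number of n-grams

% Column vector of start offsets plus row vector 1:n gives an m-by-n matrix.
idx = (0:m-1).' + (1:n);       % [1 2 3; 2 3 4; 3 4 5] for this input
out = str(idx);                % indexing with a matrix returns a char matrix

% Cross-check against the bsxfun form from the answer.
same = isequal(out, str(bsxfun(@plus, 1:n, (0:numel(str) - n)')));
```

On older releases the `bsxfun` form is the one to use, since the `+` between a column and a row vector raises a dimension error there.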

Answer by Yao Li on 16 Apr 2013

    function s=ngramsFreq(string,n)
    k=0;
    t = repmat(char(0),length(string)-n+1,n);
    for i=1:(length(string)-n)+1
        k=k+1;
        t(i,1:n)=string(k:1:(k+n)-1);
    end
    s=t;
    end


Yao Li on 16 Apr 2013

You can split your algorithm into two steps: first, find all groups of n letters; second, remove the repeated groups.
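One possible way to carry out those two steps is sketched here, assuming the n-grams are stored one per row of a char matrix. The input `'abcabc'` and the names `grams`, `uniqueGrams`, and `counts` are made up for illustration. Counting the repeats is an extra that the comment does not ask for.

```matlab
% Step 1: collect all n-grams as rows of a char matrix.
str = 'abcabc';                                 % illustrative input with a repeat
n = 3;
idx = bsxfun(@plus, 1:n, (0:numel(str) - n)');  % row i is i:i+n-1
grams = str(idx);                               % abc, bca, cab, abc

% Step 2: remove repeated rows; ic maps each original row to its unique row.
[uniqueGrams, ~, ic] = unique(grams, 'rows');   % rows come back sorted

% Optional: count how many times each unique n-gram occurred.
counts = accumarray(ic, 1);
```

Note that `unique` sorts the rows by default, so the order of first appearance is not preserved unless the `'stable'` option is added.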
