MATLAB Answers

## Find count of repeated letters (sequence)

Asked by Jothi

on 8 Oct 2013

Sir,

How can I find the number of repeated sequences (letters) in a given sentence?

For example, `a = 'I want THAAAAAT APPPPPLE ):):): totally unprepared';`

The number of repeated sequences is 3,

i.e.,

1. THAAAAAT

2. APPPPPLE

3. ):):):

thanks

Jothi

### Jothi (view profile)

on 9 Oct 2013

sir,

A repeated sequence is not merely two adjacent identical letters. It can be any letter or special character repeated consecutively more than two times.

i.e., in the word THAAAAAT the letter A is repeated consecutively more than two times.

Similarly, in the word APPPPPLE the letter P is repeated consecutively more than two times.

How can I find this?

thank you.

### Walter Roberson (view profile)

on 9 Oct 2013

Your #3, ):):): does not have continuously repeated symbols.

If the repeated sequences are to be identified, then why would all of THAAAAAT be output, and not just AAAAA ?

### Jothi (view profile)

on 9 Oct 2013

The symbol :) is an emoticon (it indicates a positive emotion).

I don't want the output as a string; I just want to find the number of repeated sequences that appear in the given sentence, i.e.,

input is,

`a = 'I want THAAAAAT APPPPPLE ):):): totally unprepared';`

output is,

No. of repeated sequences are: 3

## 1 Answer

Answer by Cedric Wannaz

on 8 Oct 2013
Edited by Cedric Wannaz

on 8 Oct 2013

Try to understand the following and fine-tune it to your needs:

` n = sum( diff([0, diff(a)==0]) == 1 )`

In particular, evaluate

` diff(a)==0`

and see how your problem actually translates into counting clusters of consecutive true values in the outcome of `diff(a)==0`.
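To make the clusters visible, the one-liner can be unpacked step by step. This is only a sketch: it assumes `a` is a char row vector (single quotes, so that `diff` works on character codes), and the names `same`, `triple`, `n2`, and `n3` are illustrative and not part of the original answer. The `triple` step is one possible way to apply the "more than two times" requirement from the question comments.

```matlab
% Char row vector (single quotes), so that diff operates on character codes.
a = 'I want THAAAAAT APPPPPLE ):):): totally unprepared';

same = diff(a) == 0;              % true where a(k) equals a(k+1)
n2 = sum(diff([0, same]) == 1);   % clusters of true = runs of 2+ equal characters

% Runs of 3+ equal characters show up as two consecutive true values in 'same'.
triple = same(1:end-1) & same(2:end);
n3 = sum(diff([0, triple]) == 1);

fprintf('runs of 2+: %d, runs of 3+: %d\n', n2, n3);   % runs of 2+: 3, runs of 3+: 2
```

Here `n2` is 3 only because the `ll` in `totally` is counted alongside `AAAAA` and `PPPPP`; neither count detects `):):):`, since no character in it is adjacent to an identical one.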

### Jothi (view profile)

on 10 Oct 2013

sir,

a repeated sequence must have more than two repetitions:

AAAAA

PPPPP

ll - not more than two (in 'totally')

thank you sir.

### Cedric Wannaz (view profile)

on 10 Oct 2013

You seem to indicate that one repeated sequence is '):'. As far as I am concerned, there is no simple generic solution if you want to detect repeated, arbitrary patterns. To illustrate,

` 'AABBCCDDEEFFAABBCCDDEEFF'`

Here, repeated patterns are

```'AA', 'BB', .., 'FF', 'AABB', 'BBCC', .., 'AABBCC', 'BBCCDD', ..,
'AABBCCDD', 'BBCCDDEE', .., 'AABBCCDDEEFF'
```

Using regular expressions, we can probably get some solution, but it will be prohibitively time-consuming.
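For the narrower case in the question, a regular expression with a backreference is one possible sketch. It assumes that a "repeated sequence" means a block of non-whitespace characters that repeats immediately, at least three times in a row; that definition and the variable names are assumptions, not something the thread settles. It does not enumerate nested patterns such as those in the `'AABBCC..'` illustration, and its running time on long texts has not been measured here.

```matlab
a = 'I want THAAAAAT APPPPPLE ):):): totally unprepared';

% (\S+?)  shortest non-empty block of non-whitespace characters (token 1)
% \1{2,}  the same block again, at least two more times
runs = regexp(a, '(\S+?)\1{2,}', 'match');

n = numel(runs);   % 3 for this sentence: 'AAAAA', 'PPPPP', '):):):'
```

The lazy quantifier `+?` makes the engine try the shortest block first, so `AAAAA` is reported as five repetitions of `A` rather than as a longer unit.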

### Sean de Wolski (view profile)

on 10 Oct 2013

Yeah, every emoticon would have to be predefined. In the chatroom we use here, there are even word emoticons like (b), which inserts a frosty beer mug, or (ply), which inserts an image of a playing card.
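With a predefined list, counting repeated emoticons could look roughly like the sketch below. The list `emoticons` is hypothetical and chosen only for illustration, and "repeated" is again assumed to mean at least three occurrences in a row.

```matlab
a = 'I want THAAAAAT APPPPPLE ):):): totally unprepared';

% Hypothetical predefined list; a real application would supply its own.
emoticons = {'):', ':)', '(b)'};

count = 0;
for k = 1:numel(emoticons)
    % Escape regex metacharacters such as '(' and ')' before building the pattern.
    pat = ['(?:', regexptranslate('escape', emoticons{k}), '){3,}'];
    count = count + numel(regexp(a, pat, 'match'));
end
% count is 1 here: only '):' occurs three times in a row
```

Note that `:)` also appears inside `):):):`, but only twice in a row, so it is not counted; overlapping entries in the list need some care.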
