# I want to convert a character series into numerical series using for loop

S Kar on 8 Jun 2022
Commented: S Kar on 8 Jun 2022
I have a character sequence stored in the variable DNA_SEQS = 'AGGTAT.....'. The sequence consists of four types of characters, 'A', 'C', 'T' and 'G', so I used a switch-case statement to generate the numerical sequence. The code I have written is:
DNA_SEQS = seqs.Sequence;
len = length(DNA_SEQS);
for j = 1:5
    x = [];
    a = DNA_SEQS(j);
    switch a
        case 'A'
            v = 0;
        case 'C'
            v = 1;
        case 'G'
            v = 2;
        case 'T'
            v = 3;
    end
    x(j+1) = [x(j) v];
end
With this code I expected to get a numerical array like [0,2,2,3,0], but instead I got the error: Index exceeds matrix dimensions.

dpb on 8 Jun 2022
Edited: dpb on 8 Jun 2022
for j = 1:5
x = [];
a = DNA_SEQS(j);
...
1. You wipe out whatever you put in x every time you start through the loop again...don't do that!!! :) Initialize x once, before the loop:
x = [];
for j = 1:5
a = DNA_SEQS(j);
...
2. Size x based on the length of the string rather than hardcoding the loop count:
N=strlength(DNA_SEQS);
x=zeros(1,N);
for j = 1:N
a = DNA_SEQS(j);
...
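Putting the two points together, a complete loop might look like the sketch below. The body elided above is filled in here as an assumption: the original line x(j+1) = [x(j) v] tries to store two values in a single element, so this sketch stores one value per position with x(j) = v. The sample string, the use of numel (chosen because it behaves the same in older releases), and the otherwise branch are also additions for illustration.

```matlab
DNA_SEQS = 'AGGTAT';          % sample sequence from this thread
N = numel(DNA_SEQS);          % number of characters in the sequence
x = zeros(1, N);              % preallocate once, outside the loop
for j = 1:N
    switch DNA_SEQS(j)
        case 'A'
            v = 0;
        case 'C'
            v = 1;
        case 'G'
            v = 2;
        case 'T'
            v = 3;
        otherwise
            % Assumption: anything other than A/C/G/T is treated as an error.
            error('Unexpected character "%s" at position %d.', DNA_SEQS(j), j);
    end
    x(j) = v;                 % assumed fix: one value per element
end
```

For the sample string, x ends up holding 0 2 2 3 0 3.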
However, in MATLAB you don't need a loop; use a lookup table instead. One way (not necessarily the fastest, but pretty easy to code) would be
DNA_VALS=interp1(double('ACGT'),0:3,double(DNA_SEQS));
This would return for your sample above...
>> DNA_SEQS = 'AGGTAT';
DNA_VALS=interp1(double('ACGT'),0:3,double(DNA_SEQS))
DNA_VALS =
0 2 2 3 0 3
>>
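For comparison, the same lookup-table idea can be sketched with plain indexing instead of interp1. This is an alternative illustration rather than the method shown above; the variable name lut and the table size of 256 are assumptions, and the sketch assumes the sequence contains only plain ASCII characters.

```matlab
DNA_SEQS = 'AGGTAT';               % sample sequence from this thread
lut = nan(1, 256);                 % one slot per character code, NaN = unmapped
lut(double('ACGT')) = 0:3;         % A->0, C->1, G->2, T->3
DNA_VALS = lut(double(DNA_SEQS));  % index the table with the character codes
```

Any character that is not in 'ACGT' comes back as NaN, which makes unexpected symbols easy to spot.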
S Kar on 8 Jun 2022
Thank you, but I still got this error using the first method:
In an assignment A(:) = B, the number of elements in A and B must be the same.
The second method is working fine.

DGM on 8 Jun 2022
You can use ismember():
thisstr = 'AGGATATC';
charmap = 'ACGT';
[~,idx] = ismember(thisstr,charmap);
idx = idx-1
idx = 1×8
0 2 2 0 3 0 3 1
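One thing to watch with this approach: ismember returns index 0 for characters that are not in charmap, so after subtracting 1 they show up as -1. A minimal sketch of flagging them instead, assuming NaN is an acceptable marker for unknown symbols (the input string 'AGNT' and the names tf and vals are made up for illustration):

```matlab
thisstr = 'AGNT';                        % hypothetical input with an unknown 'N'
charmap = 'ACGT';
[tf, idx] = ismember(thisstr, charmap);  % tf is false where no match was found
vals = idx - 1;                          % matched characters map to 0..3
vals(~tf) = NaN;                         % mark unmatched characters explicitly
```

Here vals is 0 2 NaN 3.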
S Kar on 8 Jun 2022
Thank you so much for the elaboration.