Count the 'duration' of integer values appearing sequentially in a list (viterbi path)

I have a long (~250,000 element) sequence of integers between 1 and 10 representing a sequence of states (the Viterbi path) output by a hidden Markov model.
I would like to get the duration of each individual visit to each state and put these into a matrix (or perhaps several vectors, since they may have different lengths; I'm not sure how to deal with this).
An example vpath might be [1 1 1 2 2 3 3 2 2 1 1 1 1 1 4 4 4], and I would like to return something like this:
3 5 ##first and second durations of state 1
2 2 ## first and second durations of state 2 etc.
2
3
Here is the code I have so far (the approach is really ugly and gives me weird numbers at the beginning and end, sorry that you have to look at it):
% should return array of length K with ave. number of samples per state visit
function visits = state_duration(vpath, K)
    visits = [];
    a = zeros(length(vpath), K);
    b = ones(K,1);
    visits.avgs = zeros(K,1);
    for x = 1:K
        for i = 1:length(vpath)
            if vpath(i)==x
                a(b,x)=a(b,x)+1;
            else
                b(x)=(b(x)+1);
            end
        end
        rws = a(2:end,x);
        visits.avgs(x)=mean(rws(rws~=0));
    end
    visits.full = a(2:end,:);
end
**If anyone can think of a better way to phrase this question, let me know.

Accepted Answer

Geoff Hayes on 7 Feb 2016
Alexander - please clarify what the purpose of b is. If K is the number of possible integers (so 10) then b is a 10x1 array which you then use in
a(b,x) = a(b,x) + 1;
I don't understand the intent of the above line.
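A minimal sketch of what that line does when b is a vector may help explain the odd numbers. The sizes (K = 3, five rows) are made up for illustration. Because b holds one counter per state, a(b,x) addresses every row listed in b, not just row b(x), so a count for state x can be incremented through another state's counter. The question's code probably intended a(b(x),x), though that is a guess.

```matlab
K = 3;                    % hypothetical small number of states
a = zeros(5, K);
b = ones(K, 1);           % one row counter per state, as in the question
x = 1;

a(b, x) = a(b, x) + 1;    % b is [1;1;1]: row 1 is addressed three times, incremented once
b(x) = b(x) + 1;          % b is now [2;1;1]
a(b, x) = a(b, x) + 1;    % addresses rows 2 AND 1, so row 1 is incremented again

disp(a(1:2, x).')         % row 1 has been counted twice, row 2 once
```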
If you just want to iterate over the list and count how many times each integer repeats consecutively, a single for loop (rather than the nested loops above) is enough:
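K = 10;
a = cell(1,K);                 % a{k} collects the durations of state k
totalReps = 1;                 % total number of consecutive repetitions of an integer
for m=2:length(vpath)
    if vpath(m) == vpath(m-1)
        totalReps = totalReps + 1;
    else
        % the run has ended: store its length under the state that just finished
        k = vpath(m-1);
        a{k} = [a{k} ; totalReps];
        totalReps = 1;
    end
end
% store the final run, which the loop never closes
a{vpath(end)} = [a{vpath(end)} ; totalReps];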
We use a cell array where each element represents one of the K integers. Each of these elements will be an integer array with all of the durations. For example,
a{1}
ans =
3
5
or
a{2}
ans =
2
2
You can then determine the average of the durations for each of the K states.
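As a minimal sketch of that averaging step, assuming a is the 1-by-K cell array built by the loop above, cellfun can apply mean to each cell. The example path is the one from the question, and K = 4 is chosen only to match it. A state that never occurs has an empty cell, and mean of an empty array is NaN, so such states come back as NaN.

```matlab
vpath = [1 1 1 2 2 3 3 2 2 1 1 1 1 1 4 4 4];  % example path from the question
K = 4;                                        % assumed number of states for this example

% Collect run lengths per state (same loop as above).
a = cell(1, K);
totalReps = 1;
for m = 2:length(vpath)
    if vpath(m) == vpath(m-1)
        totalReps = totalReps + 1;
    else
        a{vpath(m-1)} = [a{vpath(m-1)} ; totalReps];
        totalReps = 1;
    end
end
a{vpath(end)} = [a{vpath(end)} ; totalReps];

% Mean duration per state; NaN for any state with no visits.
avgDurations = cellfun(@mean, a);
```

For the example path this gives one average per state, in state order.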
  1 Comment
Alexander Morley on 7 Feb 2016
It was a hacky way of adding one each time an occurrence of an integer was found. Your answer (and the one below) is much more elegant. I will accept one of them tomorrow when I have access to my other computer.


More Answers (1)

Jan on 7 Feb 2016
Perhaps you mean this:
[B, N] = RunLength(vpath);
Result = cell(1, K);
for ik = 1:K
    Result{ik} = N(B == ik);
end
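RunLength is not a built-in MATLAB function, so it has to be obtained separately. If it is not available, the same two outputs can be sketched with diff and find. This is an assumption-laden stand-in rather than that function's actual implementation: it assumes vpath is a non-empty numeric vector, and K = 4 is chosen only to match the example path from the question.

```matlab
vpath = [1 1 1 2 2 3 3 2 2 1 1 1 1 1 4 4 4];  % example path from the question
K = 4;                                        % assumed number of states for this example

vpath = vpath(:).';                                 % force a row vector
runEnds = [find(diff(vpath) ~= 0), numel(vpath)];   % index of the last sample of each run
B = vpath(runEnds);                                 % state of each run
N = diff([0, runEnds]);                             % length of each run

% Group the run lengths by state, as in the loop above.
Result = cell(1, K);
for ik = 1:K
    Result{ik} = N(B == ik);
end
```

With a path of ~250,000 samples this avoids growing arrays inside a loop over every sample; only the short loop over the K states remains.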
