Align a sequence longer than the model with hmmprofalign

Hello, I want to use the hmmprofile functions to align sequences of characters that are not amino acids (or nucleotides) but generic letters (short: 5-10 chars). I'm estimating the HMM from a set of examples with hmmprofestimate.
Now, the question is: when I try to align a new sequence to the model and the sequence is longer than the model, it gets trimmed to the model length.
E.g. if I have the sequence 'LRALPIT' and I align it to a model with 4 states, I get 'LRAL'; the other characters are cut off. Also, the alignment score is the same for 'LRAL', 'LRALPIT' or 'LRALRL.....LR' (as long as you want).
I noticed that the model's InsertEmission field is entirely made of 0.05, which multiplied by 20 (the number of amino acids) gives 1. That makes sense because the probabilities must sum to 1, but it's the same number for every position in every state. I suspect my sequence gets trimmed because the model does not place insert states (which might otherwise align the sequence above as 'LRALpit', with 'pit' as insert emissions?).
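A minimal sketch of the setup described above, assuming the Bioinformatics Toolbox is installed. The training strings are hypothetical placeholders, not the original data, and the last call only tries the documented 'Flanks' option of hmmprofalign, which is described as including symbols generated by the flanking insert states in the output. Whether that recovers the trailing 'PIT' for this model is an assumption to check, not a confirmed result.

```matlab
% Hypothetical training examples (placeholders, not the original data).
% All columns are uppercase, so each one is treated as a match state.
examples = ['LRAL'; 'LRAI'; 'LKAL'; 'IRAL'];

% 4-state profile HMM with the default amino-acid alphabet.
model = hmmprofstruct(4);
model = hmmprofestimate(model, examples);

% Each row of InsertEmission is a probability distribution over the
% 20 symbols, so it should sum to 1 (20 x 0.05 when uniform).
rowSums = sum(model.InsertEmission, 2);

% Default call: symbols from the flanking insert states are not shown.
[score, aln] = hmmprofalign(model, 'LRALPIT');

% Same alignment, asking for the flanking-insert symbols to be included.
[scoreF, alnF] = hmmprofalign(model, 'LRALPIT', 'Flanks', true);

disp(aln);
disp(alnF);
```

Comparing aln with alnF, and score with scoreF, should show whether the extra characters are being dropped only from the displayed alignment or are also being ignored in the score.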
Any suggestions?
Thanks in advance, Marco.

Answers (0)
