hmmprofstruct - Create or edit hidden Markov model (HMM) profile structure

Syntax

Model = hmmprofstruct(Length)
Model = hmmprofstruct(Length, Field1, Field1Value, Field2, Field2Value, ...)
NewModel = hmmprofstruct(Model, Field1, Field1Value, Field2, Field2Value, ...)

Input Arguments

Length        Number of match states in the model.
Model         MATLAB structure containing fields for the parameters of an HMM profile created with the hmmprofstruct function.
Field         String containing a field name in the structure Model. See the table below for field names.
FieldValue    Value associated with Field. See the table below for descriptions.

Output Arguments

Model         MATLAB structure containing fields for the parameters of an HMM profile.

Description

Model = hmmprofstruct(Length) returns Model, a MATLAB structure containing fields for the parameters of an HMM profile. Length specifies the number of match states in the model. All other required parameters are set to their default values.

Model = hmmprofstruct(Length, Field1, Field1Value, Field2, Field2Value, ...) returns an HMM profile structure using the specified parameters. All other required parameters are set to their default values.

NewModel = hmmprofstruct(Model, Field1, Field1Value, Field2, Field2Value, ...) returns an updated HMM profile structure using the specified parameters. All other parameters are taken from the input Model.
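
As a rough illustration of the three calling forms, the sketch below creates a default model, creates a model with one field set at creation time, and then edits an existing model. It assumes Bioinformatics Toolbox is installed (the guard skips the calls otherwise); the variable names, the length of 10, and the NullX values are arbitrary choices for illustration.

```matlab
% Sketch of the three calling forms (requires Bioinformatics Toolbox).
% Variable names and numeric values are illustrative only.
if exist('hmmprofstruct', 'file')
    % Form 1: give only the number of MATCH states; everything else defaults.
    m1 = hmmprofstruct(10);

    % Form 2: set a field at creation time.
    m2 = hmmprofstruct(10, 'Alphabet', 'NT');

    % Form 3: edit an existing model; fields not named are taken from m2.
    m3 = hmmprofstruct(m2, 'NullX', [0.02; 0.98]);

    disp(m1.Alphabet)      % default alphabet, documented as 'AA'
    disp(m3.ModelLength)   % carried over from m2
end
```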

HMM Profile Structure

The MATLAB structure Model contains the following fields, which are the required and optional parameters of an HMM profile. All probability values are in the [0 1] range.

Field Description
ModelLength

Integer specifying the length of the profile (number of MATCH states).

Alphabet

String specifying the alphabet used in the model. Choices are 'AA' (default) or 'NT'.

    Note   AlphaLength, the number of symbols in the alphabet, is 20 for 'AA' and 4 for 'NT'.

MatchEmission

Symbol emission probabilities in the MATCH states.

Either of the following:

  • A matrix of size ModelLength-by-AlphaLength, where each row corresponds to the emission distribution for a specific MATCH state. Defaults to uniform distributions.

  • A structure containing residue counts, such as returned by aacount or basecount.
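
The matrix form can be sketched as follows: start from per-state symbol counts and divide each row by its total, so that each row is an emission distribution. The counts are made up, and the two-step call (set the alphabet first, then the emissions) is a cautious assumption rather than a documented requirement. The toolbox calls are skipped if hmmprofstruct is not on the path.

```matlab
% Hypothetical symbol counts for a 3-state nucleotide model:
% one row per MATCH state, one column per symbol of the 'NT' alphabet.
counts = [8 1 1 0;
          2 2 3 3;
          0 0 5 5];

% Divide each row by its total so every row is a probability distribution.
matchEmission = bsxfun(@rdivide, counts, sum(counts, 2));

rowSums = sum(matchEmission, 2);   % each entry should be 1 up to round-off

% Hand the matrix to hmmprofstruct when the toolbox is available.
if exist('hmmprofstruct', 'file')
    model = hmmprofstruct(3, 'Alphabet', 'NT');
    model = hmmprofstruct(model, 'MatchEmission', matchEmission);
end
```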

InsertEmission

Symbol emission probabilities in the INSERT states.

Either of the following:

  • A matrix of size ModelLength-by-AlphaLength, where each row corresponds to the emission distribution for a specific INSERT state. Defaults to uniform distributions.

  • A structure containing residue counts, such as returned by aacount or basecount.

NullEmission

Symbol emission probabilities in the MATCH and INSERT states for the NULL model.

Either of the following:

  • A 1-by-AlphaLength row vector. Defaults to a uniform distribution.

  • A structure containing residue counts, such as returned by aacount or basecount.

    Note   The NULL model is used to compute the log-odds ratio at every state and avoid overflow when propagating the probabilities through the model.

    Note   NULL probabilities are also known as the background probabilities.
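
To make the log-odds idea concrete, the following sketch scores symbols by comparing one state's emission distribution with the background distribution. The probabilities are invented, and the base-2 logarithm is an assumption; the scoring used internally by the toolbox may differ.

```matlab
% Made-up emission distribution for one MATCH state over a 4-symbol alphabet.
matchProb = [0.70 0.10 0.10 0.10];

% Uniform background (NULL) distribution, as in the documented default.
nullProb = ones(1, 4) / 4;

% Log-odds score per symbol: positive when the state favors the symbol over
% the background, negative otherwise. Base 2 is an assumption.
logOdds = log2(matchProb ./ nullProb);

% Scores add along a path, so no long product of probabilities is formed.
pathScore = sum(logOdds([1 1 2]));   % symbols 1, 1, 2 emitted in sequence
```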

BeginX

BEGIN state transition probabilities.

Format is a 1-by-(ModelLength + 1) row vector:

[B->D1 B->M1 B->M2 B->M3 ... B->Mend]

    Note   If necessary, hmmprofstruct will normalize the data such that the sum of the transition probabilities from the BEGIN state equals 1:

    sum(Model.BeginX) = 1
    

    For fragment profiles:

    sum(Model.BeginX(3:end)) = 0
    

Default is [0.01 0.99 0 0 ... 0].
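
A minimal sketch of the BEGIN normalization, written in plain MATLAB so that it runs without the toolbox. It only mimics the rule stated in the note; the model length and the weights are illustrative, and hmmprofstruct itself may implement the normalization differently.

```matlab
L = 4;                               % illustrative model length

% Unnormalized weights in the documented order [B->D1 B->M1 B->M2 ... B->Mend].
beginX = [1 99 0 0 0];               % 1-by-(L + 1)

% Scale so that the transitions out of BEGIN sum to 1.
beginX = beginX / sum(beginX);       % gives [0.01 0.99 0 0 0]

% Condition stated in the note for fragment profiles.
meetsFragmentCondition = sum(beginX(3:end)) == 0;
```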

MatchX

MATCH state transition probabilities.

Format is a 4-by-(ModelLength - 1) matrix:

[M1->M2 M2->M3 ... M[end-1]->Mend;
 M1->I1 M2->I2 ... M[end-1]->I[end-1];
 M1->D2 M2->D3 ... M[end-1]->Dend;
 M1->E  M2->E  ... M[end-1]->E  ]

    Note   If necessary, hmmprofstruct will normalize the data such that the sum of the transition probabilities from every MATCH state equals 1:

    sum(Model.MatchX) = [ 1 1 ... 1 ]
    

    For fragment profiles:

    sum(Model.MatchX(4,:)) = 0
    

Default is repmat([0.998 0.001 0.001 0],ModelLength-1,1).

InsertX

INSERT state transition probabilities.

Format is a 2-by-(ModelLength - 1) matrix:

[ I1->M2 I2->M3 ... I[end-1]->Mend;
  I1->I1 I2->I2 ... I[end-1]->I[end-1] ]

    Note   If necessary, hmmprofstruct will normalize the data such that the sum of the transition probabilities from every INSERT state equals 1:

    sum(Model.InsertX) = [ 1 1 ... 1 ]
    

Default is repmat([0.5 0.5],ModelLength-1,1).

DeleteX

DELETE state transition probabilities.

Format is a 2-by-(ModelLength - 1) matrix:

[ D1->M2 D2->M3 ... D[end-1]->Mend ;
  D1->D2 D2->D3 ... D[end-1]->Dend ]

    Note   If necessary, hmmprofstruct will normalize the data such that the sum of the transition probabilities from every DELETE state equals 1:

    sum(Model.DeleteX) = [ 1 1 ... 1 ]

Default is repmat([0.5 0.5],ModelLength-1,1).
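
The per-state normalization described for MatchX, InsertX, and DeleteX can be sketched in plain MATLAB as below. The layout follows the documented format (one column per state), and the weights are invented. Note that the example output in the Examples section lists these fields with the dimensions swapped (for instance 99x4 for MatchX), so the stored orientation may be the transpose; in that case the sums would be taken along dimension 2 instead.

```matlab
% Unnormalized MATCH transition weights for a model of length 4, laid out as
% documented: one column per state M1..M3, rows = [->M; ->I; ->D; ->E].
matchX = [998 90 50;
            1  5 25;
            1  5 25;
            0  0  0];

% Divide each column by its total so that every state's outgoing
% transition probabilities sum to 1.
matchX = bsxfun(@rdivide, matchX, sum(matchX, 1));

colSums = sum(matchX, 1);   % each entry should be 1 up to round-off
```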

FlankingInsertX

Flanking insert states (N and C) used for LOCAL profile alignment.

Format is a 2-by-2 matrix:

[N->B  C->T ;
 N->N  C->C]

    Note   If necessary, hmmprofstruct will normalize the data such that the sum of the transition probabilities from Flanking Insert states equals 1:

    sum(Model.FlankingInsertX) = [1 1]

    Note   To force global alignment use:

    Model.FlankingInsertX = [1 1; 0 0]

Default is [0.01 0.01; 0.99 0.99].

LoopX

Loop state transition probabilities used for multiple-hit alignment.

Format is a 2-by-2 matrix:

[E->C  J->B ;
 E->J  J->J]

    Note   If necessary, hmmprofstruct will normalize the data such that the sum of the transition probabilities from Loop states equals 1:

    sum(Model.LoopX) = [1 1] 

Default is [0.5 0.01; 0.5 0.99].

NullX

Null transition probabilities, used so that state transitions are also scored as log-odds values.

Format is a 2-by-1 column vector:

[G->F ; G->G]

    Note   If necessary, hmmprofstruct will normalize the data such that the sum of the transition probabilities from Null states equals 1:

    sum(Model.NullX) = 1

Default is [0.01; 0.99].

IDNumber

Optional. User-assigned identification number.

Description

Optional. User-assigned description of the model.

HMM Profile Model

An HMM profile model is a common statistical tool for modeling structured sequences composed of symbols. The model includes randomness in both the output (emission of symbols) and the state transitions of the process. Markov models are generally represented by state diagrams.

The following figure is a state diagram for an HMM profile of length four. INSERT, MATCH, and DELETE states are in the center section.

Flanking states (S, N, B, E, C, T) are used to model the ends of the sequence properly for global, local, or fragment alignment of the profile. S, B, E, and T are silent, while N and C are used to insert symbols at the flanks.

Examples

Creating an HMM Profile Structure

Create an HMM profile structure with 100 MATCH states, using the amino acid alphabet.

hmmProfile = hmmprofstruct(100,'Alphabet','AA')

hmmProfile = 

        ModelLength: 100
           Alphabet: 'AA'
      MatchEmission: [100x20 double]
     InsertEmission: [100x20 double]
       NullEmission: [1x20 double]
             BeginX: [101x1 double]
             MatchX: [99x4 double]
            InsertX: [99x2 double]
            DeleteX: [99x2 double]
    FlankingInsertX: [2x2 double]
              LoopX: [2x2 double]
              NullX: [2x1 double]

Editing an HMM Profile Structure

  1. Use the pfamhmmread function to create an HMM profile structure from pf00002.ls, a PFAM HMM-formatted file included with the software.

    hmm02 = pfamhmmread('pf00002.ls');
  2. Modify the HMM profile structure to force a global alignment by setting the looping transition probabilities in the flanking insert states to zero.

    hmm02 = hmmprofstruct(hmm02,'FlankingInsertX',[0 0;1 1]);
    hmm02.FlankingInsertX
    
    ans =
    
         0     0
         1     1
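
The same editing form should also apply to the optional fields listed in the field table. The sketch below sets Description on a small default model; the description text and the model length are placeholders, and accepting 'Description' as a field name is inferred from the field table rather than shown in the examples above. It requires Bioinformatics Toolbox.

```matlab
% Sketch: attach a user-assigned description to a model.
% The guard skips the calls when the toolbox is not installed.
if exist('hmmprofstruct', 'file')
    model = hmmprofstruct(10);                           % default 10-state model
    model = hmmprofstruct(model, 'Description', 'Toy 10-state profile');

    disp(model.Description)   % the text assigned above
    disp(model.ModelLength)   % unchanged by the edit
end
```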

See Also

aacount | basecount | gethmmprof | hmmprofalign | hmmprofestimate | hmmprofgenerate | hmmprofmerge | pfamhmmread | showhmmprof

  

