pfamhmmread - Read data from PFAM HMM-formatted file

Syntax

HMMStruct = pfamhmmread(File)

Input Arguments

File

Either of the following:

  • String specifying a file name, a path and file name, or a URL pointing to a file. The referenced file is a PFAM HMM-formatted file. If you specify only a file name, that file must be on the MATLAB search path or in the current folder.

  • MATLAB character array that contains the text of a PFAM HMM-formatted file.

    Tip   You can use the gethmmprof function with the 'ToFile' property to retrieve HMM profile information from the PFAM database and create a PFAM HMM-formatted file.
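The input forms described above can be sketched as follows. This is a minimal illustration, not a definitive workflow: it assumes network access to the PFAM database, the file name is simply the one used in the example on this page, and passing the result of fileread as the character array is an assumption about how the text would typically be obtained.

```matlab
% File name chosen for illustration; any writable path would do.
localFile = 'pf00002.ls';

% Retrieve the profile for the '7tm_2' family from the PFAM database and
% save the HMM-formatted text to localFile (requires network access).
gethmmprof('7tm_2', 'ToFile', localFile);

% Form 1: pass a file name (the file is in the current folder here).
HMMStruct = pfamhmmread(localFile);

% Form 2: pass a character array holding the text of the file.
% Assumption: the text is loaded with fileread.
hmmText    = fileread(localFile);
HMMStruct2 = pfamhmmread(hmmText);
```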

Output Arguments

HMMStruct

MATLAB structure containing information from a PFAM HMM-formatted file.

Description

HMMStruct = pfamhmmread(File) reads File, a PFAM HMM-formatted file, and converts it to HMMStruct, a MATLAB structure containing the following fields corresponding to parameters of an HMM profile:

Name

The protein family name (unique identifier) of the HMM profile record in the PFAM database.

PfamAccessionNumber

The protein family accession number of the HMM profile record in the PFAM database.

ModelDescription

Description of the HMM profile.

ModelLength

The length of the profile (number of MATCH states).

Alphabet

The alphabet used in the model, 'AA' or 'NT'.

    Note   AlphaLength is 20 for 'AA' and 4 for 'NT'.

MatchEmission

Symbol emission probabilities in the MATCH states.

The format is a matrix of size ModelLength-by-AlphaLength, where each row corresponds to the emission distribution for a specific MATCH state.

InsertEmission

Symbol emission probabilities in the INSERT states.

The format is a matrix of size ModelLength-by-AlphaLength, where each row corresponds to the emission distribution for a specific INSERT state.

NullEmission

Symbol emission probabilities in the MATCH and INSERT states for the NULL model.

The format is a 1-by-AlphaLength row vector.

    Note   NULL probabilities are also known as the background probabilities.

BeginX

BEGIN state transition probabilities.

Format is a 1-by-(ModelLength + 1) row vector:

[B->D1 B->M1 B->M2 B->M3 .... B->Mend]

MatchX

MATCH state transition probabilities.

Format is a 4-by-(ModelLength - 1) matrix:

[M1->M2 M2->M3 ... M[end-1]->Mend;
 M1->I1 M2->I2 ... M[end-1]->I[end-1];
 M1->D2 M2->D3 ... M[end-1]->Dend;
 M1->E  M2->E  ... M[end-1]->E  ]

InsertX

INSERT state transition probabilities.

Format is a 2-by-(ModelLength - 1) matrix:

[ I1->M2 I2->M3 ... I[end-1]->Mend;
  I1->I1 I2->I2 ... I[end-1]->I[end-1] ]

DeleteX

DELETE state transition probabilities.

Format is a 2-by-(ModelLength - 1) matrix:

[ D1->M2 D2->M3 ... D[end-1]->Mend ;
  D1->D2 D2->D3 ... D[end-1]->Dend ]

FlankingInsertX

Transition probabilities for the flanking insert states (N and C), used for LOCAL profile alignment.

Format is a 2-by-2 matrix:

[N->B  C->T ;
 N->N  C->C]

LoopX

Loop state transition probabilities, used for multiple-hit alignment.

Format is a 2-by-2 matrix:

[E->C  J->B ;
 E->J  J->J]

NullX

Null transition probabilities, used so that state transitions can also be scored as log-odds values.

Format is a 2-by-1 column vector:

[G->F ; G->G]

For more information on HMM profile models, see HMM Profile Model.
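To make the field layouts concrete, here is a small sketch that builds a toy structure by hand and indexes into it. Everything in it is hypothetical: the numbers are invented, the model has only three MATCH states over the 'NT' alphabet, and MatchX follows the 4-by-(ModelLength - 1) layout described above. Because the example output later on this page lists MatchX as 292x4, it is worth checking size(HMMStruct.MatchX) on a real structure before indexing it this way.

```matlab
% Toy stand-in for a structure returned by pfamhmmread.
% All values are invented for illustration only.
hmm.ModelLength = 3;
hmm.Alphabet    = 'NT';          % so AlphaLength is 4

% One row per MATCH state, one column per symbol.
hmm.MatchEmission = [0.70 0.10 0.10 0.10
                     0.10 0.70 0.10 0.10
                     0.25 0.25 0.25 0.25];

% Background (NULL) distribution: a 1-by-AlphaLength row vector.
hmm.NullEmission = [0.25 0.25 0.25 0.25];

% Rows: M->M, M->I, M->D, M->E. Columns: MATCH states 1 and 2.
hmm.MatchX = [0.90 0.80
              0.05 0.10
              0.04 0.05
              0.01 0.05];

% The emission matrix has ModelLength rows and AlphaLength columns.
assert(isequal(size(hmm.MatchEmission), [hmm.ModelLength 4]));

% Log-odds (base 2) of each MATCH emission relative to the background.
% Row 3 equals the background, so its log-odds are all zero.
logOdds = log2(bsxfun(@rdivide, hmm.MatchEmission, hmm.NullEmission));

% Probability of the transitions M1->M2->M3, using only the
% MATCH-to-MATCH row of MatchX.
pathProb = prod(hmm.MatchX(1, :));
```

bsxfun is used for the row-wise division so that the sketch does not depend on implicit expansion, which older MATLAB releases lack.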

Examples

Read a locally saved PFAM HMM-formatted file into a MATLAB structure.

pfamhmmread('pf00002.ls')

ans = 

                   Name: '7tm_2'
    PfamAccessionNumber: 'PF00002.15'
       ModelDescription: '7 transmembrane receptor (Secretin family)'
            ModelLength: 293
               Alphabet: 'AA'
          MatchEmission: [293x20 double]
         InsertEmission: [293x20 double]
           NullEmission: [1x20 double]
                 BeginX: [294x1 double]
                 MatchX: [292x4 double]
                InsertX: [292x2 double]
                DeleteX: [292x2 double]
        FlankingInsertX: [2x2 double]
                  LoopX: [2x2 double]
                  NullX: [2x1 double]
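Assigning the result to a variable makes the individual fields available through dot notation. The following is a sketch under the same assumption as the example above, namely that 'pf00002.ls' is in the current folder or on the MATLAB search path. The variable name hmm is arbitrary.

```matlab
% Read the profile into a variable instead of displaying it.
hmm = pfamhmmread('pf00002.ls');

% Access individual fields with dot notation.
fprintf('%s (%s): %d MATCH states\n', ...
    hmm.Name, hmm.PfamAccessionNumber, hmm.ModelLength);

% The emission matrix has one row per MATCH state.
assert(size(hmm.MatchEmission, 1) == hmm.ModelLength);

% Plot the profile with showhmmprof, one of the functions listed under See Also.
showhmmprof(hmm);
```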

See Also

gethmmalignment | gethmmprof | hmmprofalign | hmmprofstruct | showhmmprof

  


© 1984-2012 The MathWorks, Inc.