seqlogo - Display sequence logo for nucleotide or amino acid sequences

Syntax

seqlogo(Seqs)
seqlogo(Profile)
WgtMatrix = seqlogo(...)
[WgtMatrix, Handle] = seqlogo(...)

seqlogo(..., 'Displaylogo', DisplaylogoValue, ...)
seqlogo(..., 'Alphabet', AlphabetValue, ...)
seqlogo(..., 'Startat', StartatValue, ...)
seqlogo(..., 'Endat', EndatValue, ...)
seqlogo(..., 'SSCorrection', SSCorrectionValue, ...)

Arguments

Seqs

Set of pairwise or multiply aligned nucleotide or amino acid sequences, represented by any of the following:

  • Character array

  • Cell array of strings

  • Array of structures containing a Sequence field
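The three accepted representations of Seqs can be sketched as follows. This is only an illustration: the variable names seqsChar, seqsCell, and seqsStruct are made up for the example, and the final call assumes the Bioinformatics Toolbox is installed.

```matlab
% Character array: one aligned sequence per row (rows must have equal length).
seqsChar = ['ATTATAGCAAACTA'; ...
            'AACATGCCAAAGTA'; ...
            'ATCATGCAAAAGGA'];

% Cell array of strings built from the same rows.
seqsCell = cellstr(seqsChar);

% Array of structures, each with a Sequence field.
seqsStruct = struct('Sequence', seqsCell);

% Any of the three forms can be passed to seqlogo.
seqlogo(seqsStruct)
```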

Profile

Sequence profile distribution matrix with the frequency of nucleotides or amino acids for every column in the multiple alignment, such as returned by the seqprofile function.

The size of the frequency distribution matrix is:

  • For nucleotides — [4 x sequence length]

  • For amino acids — [20 x sequence length]

If gaps were included, Profile may have 5 rows (for nucleotides) or 21 rows (for amino acids), but seqlogo ignores gaps.
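As a rough sketch of the Profile input, the following builds a frequency matrix with seqprofile and passes it to seqlogo. It assumes the Bioinformatics Toolbox is installed; the variable names S and P are illustrative only.

```matlab
% Aligned nucleotide sequences.
S = {'ATTATAGCAAACTA', ...
     'AACATGCCAAAGTA', ...
     'ATCATGCAAAAGGA'};

% Frequency of each nucleotide in every alignment column.
% With default settings gaps are not counted, so P has 4 rows.
P = seqprofile(S, 'Alphabet', 'NT');

% Display the logo directly from the profile.
seqlogo(P)
```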

DisplaylogoValue

Controls the display of a sequence logo. Choices are true (default) or false.

AlphabetValue

String specifying the type of sequence (nucleotide or amino acid). Choices are 'NT' (default) or 'AA'.

StartatValue

Positive integer that specifies the starting position for the sequences in Seqs. Default starting position is 1.

EndatValue

Positive integer that specifies the ending position for the sequences in Seqs. Default ending position is the maximum length of the sequences in Seqs.

SSCorrectionValue

Controls the use of small sample correction in the estimation of the number of bits. Choices are true (default) or false.

Return Values

WgtMatrix

Cell array containing the symbol list in Seqs or Profile and the weight matrix used to graphically display the sequence logo.

Handle

Handle to the sequence logo figure.

Description

seqlogo(Seqs) displays a sequence logo for Seqs, a set of aligned sequences. The logo graphically displays the sequence conservation at each position in the alignment, measured in bits. The maximum sequence conservation per site is log2(4) = 2 bits for nucleotide sequences and log2(20) ≈ 4.32 bits for amino acid sequences. If the sequence conservation value is zero or negative, no logo is displayed in that position.
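To make the bit scale concrete, the sketch below computes per-column conservation by hand for a small nucleotide alignment, using only base MATLAB. It is an illustration of the measure described by Schneider and Stephens, not a statement of how seqlogo is implemented internally. The variable names and the approximate small sample correction term e = (s - 1) / (2 ln(2) n) are assumptions made for the example.

```matlab
% Aligned nucleotide sequences as a character array.
S = ['ATTATAGCAAACTA'; ...
     'AACATGCCAAAGTA'; ...
     'ATCATGCAAAAGGA'];
symbols = 'ACGT';
[n, len] = size(S);

% Relative frequency of each symbol in each column.
F = zeros(numel(symbols), len);
for k = 1:numel(symbols)
    F(k, :) = sum(S == symbols(k), 1) / n;
end

% Shannon entropy per column, treating 0*log2(0) as 0.
T = F .* log2(F);
T(F == 0) = 0;
H = -sum(T, 1);

% Conservation in bits without small sample correction.
R = log2(numel(symbols)) - H;

% Assumed approximate small sample correction; negative values are
% clipped to zero, which corresponds to an empty logo position.
e = (numel(symbols) - 1) / (2 * log(2) * n);
Rc = max(R - e, 0);
```

Column 1 contains only A, so its entropy is zero and R(1) reaches the 2-bit maximum. With only three sequences the correction term is large, which is why the corrected values in Rc stay well below 2 bits.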

seqlogo(Profile) displays a sequence logo for Profile, a sequence profile distribution matrix with the frequency of nucleotides or amino acids for every column in the multiple alignment, such as returned by the seqprofile function.

Color Code for Nucleotides

Nucleotide    Color
A             Green
C             Blue
G             Yellow
T, U          Red
Other         Purple

Color Code for Amino Acids

Amino Acid         Chemical Property    Color
G S T Y C Q N      Polar                Green
A V L I P W F M    Hydrophobic          Orange
D E                Acidic               Red
K R H              Basic                Blue
Other                                   Tan

WgtMatrix = seqlogo(...) returns a cell array of unique symbols in the sequence Seqs or Profile, and the information weight matrix used to graphically display the logo.

[WgtMatrix, Handle] = seqlogo(...) returns a handle to the sequence logo figure.
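A minimal sketch of capturing the weight matrix without drawing a figure is shown below. It assumes the Bioinformatics Toolbox is installed and that the returned cell array holds the symbol list in its first cell and the weight matrix in its second; the variable names W, symbols, and weights are illustrative.

```matlab
% Aligned nucleotide sequences.
S = {'ATTATAGCAAACTA', ...
     'AACATGCCAAAGTA', ...
     'ATCATGCAAAAGGA'};

% Compute the weight matrix only; no figure is displayed.
W = seqlogo(S, 'Displaylogo', false);

% Assumed layout: symbol list first, weight matrix second.
symbols = W{1};
weights = W{2};
```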

seqlogo(..., 'PropertyName', PropertyValue, ...) calls seqlogo with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:

seqlogo(..., 'Displaylogo', DisplaylogoValue, ...) controls the display of a sequence logo. Choices are true (default) or false.

seqlogo(..., 'Alphabet', AlphabetValue, ...) specifies the type of sequence (nucleotide or amino acid). Choices are 'NT' (default) or 'AA'.

seqlogo(..., 'Startat', StartatValue, ...) specifies the starting position for the sequences in Seqs. Default starting position is 1.

seqlogo(..., 'Endat', EndatValue, ...) specifies the ending position for the sequences in Seqs. Default ending position is the maximum length of the sequences in Seqs.

seqlogo(..., 'SSCorrection', SSCorrectionValue, ...) controls the use of small sample correction in the estimation of the number of bits. Choices are true (default) or false.

Examples

Displaying a Sequence Logo for a Nucleotide Sequence

  1. Create a series of aligned nucleotide sequences.

    S = {'ATTATAGCAAACTA',...
         'AACATGCCAAAGTA',...
         'ATCATGCAAAAGGA'}
    
  2. Display the sequence logo.

    seqlogo(S)

  3. Notice that correction for small samples prevents you from seeing columns with information equal to log2(4) = 2 bits, but you can turn this adjustment off.

    seqlogo(S,'sscorrection',false)

Displaying a Sequence Logo for an Amino Acid Sequence

  1. Create a series of aligned amino acid sequences.

    S2 = {'LSGGQRQRVAIARALAL',... 
          'LSGGEKQRVAIARALMN',... 
          'LSGGQIQRVLLARALAA',...
          'LSGGERRRLEIACVLAL',... 
          'FSGGEKKKNELWQMLAL',... 
          'LSGGERRRLEIACVLAL'};

  2. Display the sequence logo, specifying an amino acid sequence and limiting the logo to sequence positions 2 through 10.

    seqlogo(S2, 'alphabet', 'aa', 'startAt', 2, 'endAt', 10)

References

[1] Schneider, T.D., and Stephens, R.M. (1990). Sequence Logos: A new way to display consensus sequences. Nucleic Acids Research 18, 6097–6100.

See Also

Bioinformatics Toolbox functions: seqconsensus, seqdisp, seqprofile

  


© 1984-2008 The MathWorks, Inc.