multialignread

Read multiple sequence alignment file

Syntax

S = multialignread(File)
[Headers, Sequences] = multialignread(File)
... = multialignread(File, 'IgnoreGaps', IgnoreGapsValue)

Input Arguments

File

Multiple sequence alignment file specified by one of the following:

  • File name or path and file name

  • URL pointing to a file

  • MATLAB® character array that contains the text of a multiple sequence alignment file

You can read common multiple sequence alignment file types, such as ClustalW (.aln), GCG (.msf), and PHYLIP.

IgnoreGapsValue

Controls removing gap symbols, such as '-' or '.', from the sequences. Choices are true or false (default).

Output Arguments

S

MATLAB structure array containing the following fields:

  • Header — Header information from the file.

  • Sequence — Amino acid or nucleotide sequences.

Headers

Cell array containing the header information from the file.

Sequences

Cell array containing the amino acid or nucleotide sequences.

Description

S = multialignread(File) reads a multiple sequence alignment file. Each sequence line in the file starts with a sequence header, followed by an optional number (which multialignread ignores) and a section of the sequence. The sequences are split into blocks, with the same number of blocks for every sequence. To view an example multiple sequence alignment file, type open aagag.aln at the MATLAB command line.

The output, S, is a structure array where S.Header contains the header information and S.Sequence contains the amino acid or nucleotide sequences.
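As a brief sketch of working with this output, the following reads the example file aagag.aln and accesses the fields of the structure array. The variable names (numSeqs, firstHeader, firstSeq, allHeaders) are illustrative only, and the code assumes that aagag.aln is available on the MATLAB path.

```matlab
% Read the example alignment file into a structure array.
S = multialignread('aagag.aln');

% Each element of S holds one sequence from the alignment.
numSeqs = numel(S);

% Access the header and sequence of the first entry.
firstHeader = S(1).Header;
firstSeq = S(1).Sequence;

% Collect every header into a cell array using a comma-separated list.
allHeaders = {S.Header};
```

Indexing with S(k) selects one entry, while {S.Header} gathers the same field across all entries.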

[Headers, Sequences] = multialignread(File) reads the file into separate variables, Headers and Sequences, which are cell arrays containing header information and amino acid or nucleotide sequences, respectively.
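A minimal sketch of the two-output form, again assuming that aagag.aln is on the MATLAB path. It relies on the entries of Headers and Sequences corresponding by index, and the loop variable k is an illustrative name.

```matlab
% Read headers and sequences into two separate cell arrays.
[Headers, Sequences] = multialignread('aagag.aln');

% Print each header with the length of its sequence.
for k = 1:numel(Headers)
    fprintf('%s: %d characters\n', Headers{k}, length(Sequences{k}));
end
```

This form can be convenient when only the sequences are needed, for example to pass the cell array to another function.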

... = multialignread(File, 'IgnoreGaps', IgnoreGapsValue) controls the removal of any gap symbol, such as '-' or '.', from the sequences. Choices are true or false (default).
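The effect of this option can be checked by reading the same file twice and comparing sequence lengths. This is a sketch under the assumption that aagag.aln is on the MATLAB path. The variable names are illustrative, and whether any lengths actually differ depends on how many gap symbols the file contains.

```matlab
% Read the alignment with gap symbols kept (the default).
withGaps = multialignread('aagag.aln');

% Read the same file with gap symbols removed.
noGaps = multialignread('aagag.aln', 'IgnoreGaps', true);

% Compare the length of each sequence before and after gap removal.
lenWithGaps = cellfun(@length, {withGaps.Sequence});
lenNoGaps = cellfun(@length, {noGaps.Sequence});

% Check whether the first ungapped sequence still contains '-' or '.'.
hasGap = any(ismember(noGaps(1).Sequence, '-.'));
```

Removing gaps yields the raw sequences, so the results no longer line up column by column as an alignment.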

Examples

Read a multiple sequence alignment of the gag polyprotein for several HIV strains.

gagaa = multialignread('aagag.aln')

gagaa = 

1x16 struct array with fields:
    Header
    Sequence
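
One possible follow-up, shown here only as a sketch, is to stack the aligned sequences into a character matrix so that each row is a sequence and each column is an alignment position. The names alignment, numSeqs, numCols, and firstColumn are illustrative. Note that char pads any shorter rows with trailing spaces, so the matrix is a true alignment only if all sequences have the same length.

```matlab
% Read the alignment with gap symbols kept (the default).
gagaa = multialignread('aagag.aln');

% Stack the sequences into a character matrix, one row per sequence.
alignment = char(gagaa.Sequence);

% Rows are sequences and columns are alignment positions.
[numSeqs, numCols] = size(alignment);

% Extract the residues at the first alignment position.
firstColumn = alignment(:, 1);
```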