| Contents | Index |
Profile = seqprofile(Seqs)
[Profile, Symbols]
= seqprofile(Seqs)
seqprofile(Seqs,
...'Alphabet', AlphabetValue, ...)
seqprofile(Seqs,
...'Counts', CountsValue, ...)
seqprofile(Seqs,
...'Gaps', GapsValue, ...)
seqprofile(Seqs,
...'Ambiguous', AmbiguousValue, ...)
seqprofile(Seqs,
...'Limits', LimitsValue, ...)
| Seqs | Set of multiply aligned sequences represented by any of the following:.
|
| AlphabetValue | String specifying the sequence alphabet. Choices are:
When Alphabet is 'none', the symbol list is based on the observed symbols. Each character can be any symbol, except for a hyphen (-) and a period (.), which are reserved for gaps. |
| CountsValue | Controls returning frequency (ratio of counts/total counts) or counts. Choices are true (counts) or false (frequency). Default is false. |
| GapsValue | String that controls the counting of gaps in a sequence. Choices are:
|
| AmbiguousValue | Controls counting ambiguous symbols. Enter 'Count' to add partial counts to the standard symbols. |
| LimitsValue | Specifies whether to use part of the sequence. Enter a [1x2] vector with the first position and the last position to include in the profile. Default is [1,SeqLength]. |
Profile = seqprofile(Seqs) returns Profile, a matrix of size [20 (or 4) x SequenceLength] with the frequency of amino acids (or nucleotides) for every column in the multiple alignment. The order of the rows is given by
4 nucleotides — A C G T/U
20 amino acids — A R N D C Q E G H I L K M F P S T W Y V
[Profile, Symbols] = seqprofile(Seqs) returns Symbols, a unique symbol list where every symbol in the list corresponds to a row in Profile, the profile.
seqprofile(Seqs, ...'PropertyName', PropertyValue, ...) calls seqprofile with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:
seqprofile(Seqs,
...'Alphabet', AlphabetValue, ...) selects
a nucleotide alphabet, amino acid alphabet, or no alphabet.
seqprofile(Seqs, ...'Counts', CountsValue, ...) when Counts is true, returns the counts instead of the frequency.
seqprofile(Seqs, ...'Gaps', GapsValue, ...) appends a row to the bottom of a profile (Profile) with the count for gaps.
seqprofile(Seqs, ...'Ambiguous', AmbiguousValue, ...) when Ambiguous is 'count', counts the ambiguous amino acid symbols (B Z X) and nucleotide symbols (R Y K M S W B D H V N) with the standard symbols. For example, the amino acid X adds a 1/20 count to every row while the amino acid B counts as 1/2 at the D and N rows.
seqprofile(Seqs, ...'Limits', LimitsValue, ...) specifies the start and end positions for the profile relative to the indices of the multiple alignment.
Calculate the sequence profile from set of multiply aligned sequences.
Create an array of structures representing a multiple alignment of amino acids:
seqs = fastaread('pf00002.fa');Return the sequence profile and symbol list from the set of multiply aligned sequences:
[Profile1,Symbols1] = seqprofile(seqs);
Calculate the sequence profile from set of multiply aligned sequences. Specify only part of the alignment and to count all gaps.
Create an array of structures representing a multiple alignment of amino acids:
seqs = fastaread('pf00002.fa');Return the sequence profile and symbol list from position 50 through 55 of the set of multiply aligned sequences, counting all gaps:
[Profile2,Symbols2] = seqprofile(seqs,'limits',[50 55],'gaps','all')
Profile2 =
0.0313 0.0313 0.1563 0.4375 0.1250 0.2188
0 0 0.3750 0 0 0
0 0 0.0938 0.1563 0 0
0 0 0 0.0313 0 0
0 0.0625 0 0 0.0313 0
0 0 0 0.0313 0 0
0 0 0 0.1250 0 0
0.0313 0 0.0625 0 0 0
0 0 0 0 0 0
0.4688 0.0625 0 0 0.3125 0.1563
0.1250 0.6250 0.0313 0 0.2188 0.1875
0 0 0.1250 0.0313 0 0
0.1250 0.0625 0 0 0 0.0313
0.1563 0.0313 0 0 0.0313 0
0 0 0 0 0 0
0 0.0313 0.1250 0.1250 0.0625 0.2500
0 0 0 0 0.1563 0.0938
0 0 0 0 0 0
0 0 0 0 0 0
0.0625 0.0938 0.0313 0.0625 0.0625 0.0625
0 0 0 0 0 0
Symbols2 =
ARNDCQEGHILKMFPSTWYV-fastaread | multialignread | multialignwrite | seqconsensus | seqdisp | seqlogo

See how to analyze, visualize, and model biological data and systems using MathWorks products.
Get free kit| © 1984-2012- The MathWorks, Inc. - Site Help - Patents - Trademarks - Privacy Policy - Preventing Piracy - RSS |