basecount

Count nucleotides in sequence

Syntax

NTStruct = basecount(SeqNT)
NTStruct = basecount(SeqNT, ...'Ambiguous', AmbiguousValue, ...)
NTStruct = basecount(SeqNT, ...'Gaps', GapsValue, ...)
NTStruct = basecount(SeqNT, ...'Chart', ChartValue, ...)

Input Arguments

SeqNT

Nucleotide sequence to analyze, for example a character vector of nucleotide codes such as 'TAGCTGGCCAAGCGAGCTTG'.

AmbiguousValue

Character vector specifying how to treat ambiguous nucleotide characters (R, Y, K, M, S, W, B, D, H, V, or N). Choices are:

  • 'ignore' (default) — Skips ambiguous characters.

  • 'bundle' — Counts ambiguous characters and reports the total count in the Ambiguous field.

  • 'prorate' — Counts ambiguous characters and distributes them proportionately in the appropriate fields. For example, the counts for the character R are distributed evenly between the A and G fields.

  • 'individual' — Counts ambiguous characters and reports them in individual fields.

  • 'warn' — Skips ambiguous characters and displays a warning.
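
The modes listed above can be compared side by side on a short sequence. This is a minimal sketch, assuming the Bioinformatics Toolbox is installed; the sequence 'ARG' is made up for illustration and contains one ambiguous character, R.

```matlab
% Hypothetical three-character sequence with one ambiguous code (R).
seq = 'ARG';

% Default ('ignore'): R is skipped, so only A and G are counted.
ignored = basecount(seq);

% 'bundle': R is reported in the Ambiguous field.
bundled = basecount(seq, 'Ambiguous', 'bundle');

% 'prorate': the count for R is split evenly between the A and G fields.
prorated = basecount(seq, 'Ambiguous', 'prorate');

% 'individual': R is reported in its own field.
individual = basecount(seq, 'Ambiguous', 'individual');

disp(ignored)
disp(bundled)
disp(prorated)
disp(individual)
```

Because 'prorate' splits counts, the A and G fields in that result can hold fractional values.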

GapsValue

Specifies whether gaps, indicated by a hyphen (-), are counted or ignored. Choices are true or false (default).
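
As a rough illustration of the gap setting, the sketch below counts a made-up aligned fragment with and without gaps. The sequence is hypothetical, and because the text above does not name the field that holds the gap count, the sketch lists the field names of each result instead of assuming one.

```matlab
% Hypothetical aligned fragment containing two gap characters.
seq = 'TAGC--GGCA';

% Default ('Gaps' is false): the hyphens are ignored.
noGaps = basecount(seq);

% With 'Gaps' set to true, the hyphens are counted as well.
withGaps = basecount(seq, 'Gaps', true);

% Compare the fields of the two results to see where gaps are reported.
disp(fieldnames(noGaps))
disp(fieldnames(withGaps))
```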

ChartValue

Character vector specifying a chart type. Choices are 'pie' or 'bar'.
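
A short sketch of both chart types, reusing the sequence from the Examples section. Opening a new figure before each call is an assumption made here so that the second chart does not overwrite the first.

```matlab
seq = 'TAGCTGGCCAAGCGAGCTTG';

% Pie chart of the base proportions; the counts are still returned.
figure;
counts = basecount(seq, 'Chart', 'pie');

% Bar chart of the same sequence in a separate figure window.
figure;
basecount(seq, 'Chart', 'bar');
```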

Output Arguments

NTStruct

1-by-1 MATLAB structure containing the fields A, C, G, and T.

Description

NTStruct = basecount(SeqNT) counts the number of each type of base in SeqNT, a nucleotide sequence, and returns the counts in NTStruct, a 1-by-1 MATLAB structure containing the fields A, C, G, and T.

  • The character U is added to the T field.

  • Ambiguous nucleotide characters (R, Y, K, M, S, W, B, D, H, V, or N), and gaps, indicated by a hyphen (-), are ignored by default.

  • Unrecognized characters are ignored and trigger the following warning:

    Warning: Unknown symbols appear in the sequence. These will be ignored.
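
The default behavior described above can be checked with two small calls. Both sequences are made up for illustration, and the second assumes that Z is treated as an unrecognized character because it is not a nucleotide code.

```matlab
% Hypothetical RNA fragment: each U is added to the T field.
rna = basecount('AUGCUU');
disp(rna.T)

% Hypothetical sequence with a character (Z) that is not a nucleotide code.
% Per the description, it is ignored and a warning is displayed.
odd = basecount('ACGTZ');
disp(odd)
```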

NTStruct = basecount(SeqNT, ...'PropertyName', PropertyValue, ...) calls basecount with optional properties specified as property name/property value pairs. You can specify one or more properties in any order. Enclose each PropertyName in single quotation marks; property names are case insensitive. The available pairs are as follows:

NTStruct = basecount(SeqNT, ...'Ambiguous', AmbiguousValue, ...) specifies how to treat ambiguous nucleotide characters (R, Y, K, M, S, W, B, D, H, V, or N). Choices are:

  • 'ignore' (default)

  • 'bundle'

  • 'prorate'

  • 'individual'

  • 'warn'

NTStruct = basecount(SeqNT, ...'Gaps', GapsValue, ...) specifies whether gaps, indicated by a hyphen (-), are counted or ignored. Choices are true or false (default).

NTStruct = basecount(SeqNT, ...'Chart', ChartValue, ...) creates a chart showing the relative proportions of the nucleotides. ChartValue can be 'pie' or 'bar'.
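
Since the pairs can appear in any order and the names are case insensitive, the two calls in the sketch below should be interchangeable. The sequence is hypothetical and mixes ambiguous characters (R, Y) with gaps.

```matlab
% Hypothetical sequence with two ambiguous codes and two gaps.
seq = 'TAGCRY--GGCA';

% Same properties, different order and different letter case.
a = basecount(seq, 'Ambiguous', 'bundle', 'Gaps', true);
b = basecount(seq, 'gaps', true, 'ambiguous', 'bundle');

% isequal compares the two structures field by field.
disp(isequal(a, b))
```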

Examples

Count the bases in a DNA sequence and return the results in a structure.

bases = basecount('TAGCTGGCCAAGCGAGCTTG')
bases = 

  struct with fields:

    A: 4
    C: 5
    G: 7
    T: 4

Get the count for adenine (A) bases.

bases.A
ans =

     4
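
The fields can also be combined in ordinary arithmetic. As one possible use, the sketch below derives the GC fraction of the same sequence from the returned counts; the variable names total and gcFraction are introduced here for illustration.

```matlab
bases = basecount('TAGCTGGCCAAGCGAGCTTG');

% Total number of counted bases: 4 + 5 + 7 + 4 = 20.
total = bases.A + bases.C + bases.G + bases.T;

% Fraction of bases that are G or C: (7 + 5) / 20 = 0.6.
gcFraction = (bases.G + bases.C) / total;

fprintf('GC fraction: %.2f\n', gcFraction);   % prints "GC fraction: 0.60"
```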

Count the bases in a DNA sequence containing ambiguous characters (R, Y, K, M, S, W, B, D, H, V, or N), listing each of them in a separate field.

basecount('ABCDGGCCAAGCGAGCTTG','Ambiguous','individual')
ans = 

  struct with fields:

    A: 4
    C: 5
    G: 6
    T: 2
    R: 0
    Y: 0
    K: 0
    M: 0
    S: 0
    W: 0
    B: 1
    D: 1
    H: 0
    V: 0
    N: 0

Introduced before R2006a