dndsml

Estimate synonymous and nonsynonymous substitution rates using maximum likelihood method

Syntax

[Dn, Ds, Like] = dndsml(SeqNT1, SeqNT2)

[Dn, Ds, Like] = dndsml(SeqNT1, SeqNT2, ...'GeneticCode', GeneticCodeValue, ...)
[Dn, Ds, Like] = dndsml(SeqNT1, SeqNT2, ...'Verbose', VerboseValue, ...)

Arguments

SeqNT1, SeqNT2Nucleotide sequences. Enter either a string or a structure with the field Sequence.
GeneticCodeValueProperty to specify a genetic code. Enter a Code Number or a string with a Code Name from the table Genetic Code. If you use a Code Name, you can truncate it to the first two characters. Default is 1 or Standard.
VerboseValueProperty to control the display of the codons considered in the computations and their amino acid translations. Choices are true or false (default).

    Tip   Specify true to use this display to manually verify the codon alignment of the two input sequences. The presence of stop codons (*) in the amino acid translation can indicate that SeqNT1 and SeqNT2 are not codon-aligned.

Return Values

Dn Nonsynonymous substitution rate(s).
DsSynonymous substitution rate(s).
LikeLikelihood of estimate of substitution rates.

Description

[Dn, Ds, Like] = dndsml(SeqNT1, SeqNT2) estimates the synonymous and nonsynonymous substitution rates between the two homologous sequences, SeqNT1 and SeqNT2, using the Goldman-Yang method (1994). This maximum likelihood method estimates an explicit model for codon substitution that accounts for transition/transversion rate bias and base/codon frequency bias. Then it uses the model to correct synonymous and nonsynonymous counts to account for multiple substitutions at the same site. The maximum likelihood method is best suited when the sample size is significant (larger than 100 bases) and when the sequences being compared can have transition/transversion rate biases and base/codon frequency biases.

dndsml returns:

This analysis:

[Dn, Ds, Like] = dndsml(SeqNT1, SeqNT2, ...'PropertyName', PropertyValue, ...) calls dnds with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:


[Dn, Ds, Like] = dndsml(SeqNT1, SeqNT2, ...'GeneticCode', GeneticCodeValue, ...)
calculates synonymous and nonsynonymous substitution rates using the specified genetic code. Enter a Code Number or a string with a Code Name from the table Genetic Code. If you use a Code Name, you can truncate it to the first two characters. Default is 1 or Standard.

[Dn, Ds, Like] = dndsml(SeqNT1, SeqNT2, ...'Verbose', VerboseValue, ...) controls the display of the codons considered in the computations and their amino acid translations. Choices are true or false (default).

Examples

Estimating Synonymous and Nonsynonymous Substitution Rates Between the gag Genes of Two HIV Viruses

  1. Retrieve two sequences from the GenBank® database for the gag genes of two HIV viruses

    gag1 = getgenbank('L11768');
    gag2 = getgenbank('L11770');
  2. Estimate the synonymous and nonsynonymous substitution rates between the two sequences.

    [dn ds like] = dndsml(gag1, gag2)
    
    dn =
        0.0259
    ds =
        0.0623
    like =
       -2.1857e+003
    

Estimating Synonymous and Nonsynonymous Substitution Rates Between Two Nucleotide Sequences That Are Not Codon-Aligned

  1. Retrieve two nucleotide sequences from the GenBank database for the neuraminidase (NA) protein of two strains of the Influenza A virus (H5N1).

     hk01 = getgenbank('AF509094');
     vt04 = getgenbank('DQ094287');
    
  2. Extract the coding region from the two nucleotide sequences.

    hk01_cds = featuresparse(hk01,'feature','CDS','Sequence',true);
    vt04_cds = featuresparse(vt04,'feature','CDS','Sequence',true);
    
  3. Align the amino acids sequences converted from the nucleotide sequences.

     [sc,al]=nwalign(nt2aa(hk01_cds),nt2aa(vt04_cds),'extendgap',1);
    
  4. Use the seqinsertgaps function to copy the gaps from the aligned amino acid sequences to their corresponding nucleotide sequences, thus codon-aligning them.

     hk01_aligned = seqinsertgaps(hk01_cds,al(1,:));
     vt04_aligned = seqinsertgaps(vt04_cds,al(3,:));
    
  5. Estimate the synonymous and nonsynonymous substitutions rates of the codon-aligned nucleotide sequences and also display the codons considered in the computations and their amino acid translations.

    [dn,ds] = dndsml(hk01_aligned,vt04_aligned,'verbose',true)
    
    dn =
        0.0445
    
    ds =
        0.1576

References

[1] Tamura, K., and Mei, M. (1993). Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution 10, 512–526.

[2] Yang, Z., and Nielsen, R. (2000). Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Molecular Biology and Evolution 17, 32–43.

[3] Goldman, N., and Yang, Z. (1994). A Codon-based Model of Nucleotide Substitution for Protein-coding DNA Sequences. Mol. Biol. Evol. 11(5), 725–736.

See Also

Bioinformatics Toolbox™ functions: dnds, featuresparse, geneticcode, nt2aa, nwalign, seqinsertgaps, seqpdist

  


 © 1984-2008- The MathWorks, Inc.    -   Site Help   -   Patents   -   Trademarks   -   Privacy Policy   -   Preventing Piracy   -   RSS