
Sequence Tool

Overview of the Sequence Tool

The Sequence Tool window integrates many of the sequence functions in the toolbox. Instead of entering commands in the MATLAB Command Window, you can select menu commands and enter options in dialog boxes.

Importing a Sequence

The first step when analyzing a nucleotide or amino acid sequence is to import sequence information into the MATLAB environment. The Sequence Tool can connect to Web databases such as NCBI and EMBL and read information into the MATLAB environment.

The following procedure illustrates how to retrieve sequence information from the NCBI database on the Web. This example uses the GenBank accession number NM_000520, which identifies the human gene HEXA, a gene associated with Tay-Sachs disease.

  1. In the MATLAB Command Window, type

    seqtool

    The Sequence Tool window opens without a sequence loaded. Notice that the panes to the right and bottom are blank.

  2. To retrieve a sequence from the NCBI database, select File > Download Sequence from > NCBI.

    The Download Sequence From NCBI dialog box opens.

  3. In the Enter Sequence box, type an accession number for an NCBI database entry, for example, NM_000520. Select the Nucleotide option button, and then click OK.

    The MATLAB software accesses the NCBI database on the Web, loads nucleotide sequence information for the accession number you entered, and calculates some basic statistics.
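The same import can be approximated from the command line. The following is a minimal sketch, assuming the Bioinformatics Toolbox is installed and an Internet connection is available. It uses the toolbox functions getgenbank and basecount. The variable names are illustrative, and the statistics shown here are not necessarily the same ones the Sequence Tool calculates.

```matlab
% Sketch only: requires Bioinformatics Toolbox and an Internet connection.
accession = 'NM_000520';          % accession number used in this example
S = getgenbank(accession);        % structure holding the GenBank record

seqLength = length(S.Sequence);   % number of nucleotides in the record
counts = basecount(S.Sequence);   % structure with fields A, C, G, and T
fprintf('%s: %d bases (A=%d, C=%d, G=%d, T=%d)\n', ...
    accession, seqLength, counts.A, counts.C, counts.G, counts.T);

seqtool(S)                        % open the retrieved record in the Sequence Tool
```

Passing the structure to seqtool opens the window with the sequence already loaded, so steps 2 and 3 are not needed in that case.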

Viewing Nucleotide Sequence Information

After you import a sequence into the Sequence Tool window, you can read information stored with the sequence, or you can view graphic representations for ORFs and CDSs.

  1. In the left pane tree, click Comments. The right pane displays general information about the sequence.

  2. Now click Features. The right pane displays NCBI feature information, including index numbers for a gene and any CDS sequences.

  3. Click ORF to show the search results for ORFs in the six reading frames.

  4. Click Annotated CDS to show the protein coding part of a nucleotide sequence.
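Much of the information shown in these panes is also stored in the structure that getgenbank returns. The following sketch assumes the record has the Definition, Comment, Features, and CDS fields, which is typical for a GenBank record retrieved this way but is worth confirming with fieldnames(S) for your own entry.

```matlab
% Sketch only: requires Bioinformatics Toolbox and an Internet connection.
S = getgenbank('NM_000520');

disp(S.Definition)     % one-line description of the record
disp(S.Comment)        % general information, as in the Comments pane
disp(S.Features)       % feature table text, as in the Features pane

% Assumption: the record annotates at least one CDS with an indices field.
cds = S.CDS(1);
fprintf('Annotated CDS spans positions %d to %d\n', ...
    cds.indices(1), cds.indices(end));
```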

Searching for Words

The following procedure illustrates how to search for characteristic words and sequence patterns. You will search for sequence patterns like the TATAA box and patterns for specific restriction enzymes.

  1. Select Sequence > Find Word.

  2. In the Find Word dialog box, type a sequence word or pattern, for example, atg, then click Find.

    The Sequence Tool window searches and displays the location of the selected word.

  3. Clear the display by clicking the Clear Word Selection button on the toolbar.
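A word search of this kind can also be sketched with core MATLAB string functions. The short sequence below is hypothetical and is used only for illustration. The sketch uses strfind for a literal word and regexp for a pattern. It is not a description of how the Sequence Tool performs its search.

```matlab
% Hypothetical short sequence used only for illustration.
seq = 'ttatgcatataaggatgc';

% Literal word: start position of every occurrence of 'atg'.
atgPositions = strfind(seq, 'atg');

% Pattern: 'tata' followed by either a or t.
tataPositions = regexp(seq, 'tata[at]');

fprintf('atg found at: %s\n', mat2str(atgPositions));
fprintf('tata[at] found at: %s\n', mat2str(tataPositions));
```

For this sequence, atg starts at positions 3 and 15, and the pattern matches at position 8. To search a downloaded sequence instead, replace seq with the Sequence field of the structure returned by getgenbank.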

Exploring Open Reading Frames

The following procedure illustrates how to identify the protein coding part of a nucleotide sequence and copy it into a new view. Identifying coding sections of a nucleotide sequence is a common bioinformatics task. After locating the coding part of a sequence, you can copy it to a new view, translate it to an amino acid sequence, and continue with your analysis.

  1. In the left pane, click ORF.

    The Sequence Tool window displays the ORFs for the six reading frames in the lower-right pane. Hover the cursor over a frame to display information about it.

  2. Click the longest ORF on reading frame 2.

    The ORF is highlighted to indicate the part of the sequence that is selected.

  3. Right-click the selected ORF and then select Export to Workspace. In the Export to MATLAB Workspace dialog box, type a variable name, for example, NM_000520_ORF_2, then click Export.

    The NM_000520_ORF_2 variable is added to the MATLAB Workspace.

  4. Select File > Import from Workspace. Type the name of a variable with an exported ORF, for example, NM_000520_ORF_2, and then click Import.

    The Sequence Tool window adds a tab at the bottom for the new sequence while leaving the original sequence open.

  5. In the left pane, click Full Translation. Select Display > Amino Acid Residue Display > One Letter Code.

    The Sequence Tool window displays the amino acid sequence below the nucleotide sequence.
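The steps above can be sketched at the command line with the toolbox functions seqshoworfs and nt2aa. This is a minimal sketch, assuming the Bioinformatics Toolbox and an Internet connection. The default ORF settings of seqshoworfs may differ from those of the Sequence Tool, so the longest ORF found here is not guaranteed to match the one selected in the window. Check whether the Stop position includes the stop codon before relying on the extracted range.

```matlab
% Sketch only: requires Bioinformatics Toolbox and an Internet connection.
S = getgenbank('NM_000520');

% Find ORFs in all six reading frames (this call also opens a display).
orfs = seqshoworfs(S.Sequence, 'Frames', 'all');

% Pick the longest complete ORF on reading frame 2.
frame = 2;
n = min(numel(orfs(frame).Start), numel(orfs(frame).Stop));  % skip an unterminated ORF
orfLengths = orfs(frame).Stop(1:n) - orfs(frame).Start(1:n);
[longest, idx] = max(orfLengths);

orfStart = orfs(frame).Start(idx);
orfStop = orfs(frame).Stop(idx);
NM_000520_ORF_2 = S.Sequence(orfStart:orfStop);   % same variable name as in step 3

% Translate the ORF to one-letter amino acid codes.
protein = nt2aa(NM_000520_ORF_2);

seqtool(NM_000520_ORF_2)   % open the extracted ORF in the Sequence Tool
```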

Viewing Amino Acid Sequence Statistics

The following procedure illustrates how to view an amino acid sequence for an ORF located in a nucleotide sequence. You can import your own amino acid sequence, or you can get a protein sequence from the GenBank database. This example uses the GenBank accession number NP_000511.1, which is the alpha subunit for a human enzyme associated with Tay-Sachs disease.

  1. Select File > Download Sequence from > NCBI.

    The Download Sequence From NCBI dialog box opens.

  2. In the Enter Sequence box, type an accession number for an NCBI database entry, for example, NP_000511.1. Select the Protein option button, and then click OK.

    The MATLAB software accesses the NCBI database on the Web and loads amino acid sequence information for the accession number you entered.

  3. Select Display > Amino Acid Color Scheme, and then select Charge, Function, Hydrophobicity, Structure, or Taylor. For example, select Function.

    The display colors change to highlight function information about the amino acid residues. The following table lists the color legends for the amino acid color schemes.

Amino Acid Color Scheme    Color Legend
Charge
  • Acidic — Red

  • Basic — Light Blue

  • Neutral — Black

Function
  • Acidic — Red

  • Basic — Light Blue

  • Hydrophobic, nonpolar — Black

  • Polar, uncharged — Green

Hydrophobicity
  • Hydrophilic — Light Blue

  • Hydrophobic — Black

Structure
  • Ambivalent — Dark Green

  • External — Light Blue

  • Internal — Orange

Taylor
  • Each amino acid is assigned its own color, based on the colors proposed by W.R. Taylor [1].
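Basic statistics for the same protein can also be computed at the command line. The following sketch assumes the Bioinformatics Toolbox and an Internet connection, and uses the toolbox functions getgenpept, aacount, and molweight. The grouping of residues into acidic (D, E) and basic (K, R, H) is a common convention adopted here as an assumption. It is not taken from the Sequence Tool documentation and may not match the Charge color scheme exactly.

```matlab
% Sketch only: requires Bioinformatics Toolbox and an Internet connection.
P = getgenpept('NP_000511.1');   % structure holding the GenPept record
aa = P.Sequence;                 % amino acid sequence, one-letter codes

counts = aacount(aa);            % structure with one field per amino acid letter
mw = molweight(aa);              % molecular weight of the sequence

% Assumed grouping (not from the source): D and E acidic; K, R, and H basic.
nAcidic = counts.D + counts.E;
nBasic = counts.K + counts.R + counts.H;

fprintf('Length: %d residues, molecular weight: %.1f\n', length(aa), mw);
fprintf('Acidic residues: %d, basic residues: %d\n', nAcidic, nBasic);
```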

Closing the Sequence Tool

You can close the Sequence Tool window from the MATLAB command line by using the following syntax:

seqtool('close')

References

[1] Taylor, W.R. (1997). Residual colours: a proposal for aminochromography. Protein Engineering 10(7), 743–746.

  



© 1984-2009 The MathWorks, Inc.