seqshowwords - Graphically display words in sequence

Syntax

Struct = seqshowwords(Seq, Word)
seqshowwords(Seq, Word, ...'Color', ColorValue, ...)
seqshowwords(Seq, Word, ...'Columns', ColumnsValue, ...)
seqshowwords(Seq, Word, ...'Alphabet', AlphabetValue, ...)

Description

Struct = seqshowwords(Seq, Word) opens a separate window displaying a sequence with all occurrences of one or more words highlighted. It also returns a structure containing the start and stop positions for all occurrences of the words in the sequence.

seqshowwords(Seq, Word, ...'PropertyName', PropertyValue, ...) calls seqshowwords with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Enclose each PropertyName in single quotation marks. Each PropertyName is case insensitive. These property name/property value pairs are as follows:

seqshowwords(Seq, Word, ...'Color', ColorValue, ...) specifies the color used to highlight the words in the output display of the sequence. Default is red.

seqshowwords(Seq, Word, ...'Columns', ColumnsValue, ...) specifies the number of columns (characters per line) in the output display of the sequence. Default is 64.

seqshowwords(Seq, Word, ...'Alphabet', AlphabetValue, ...) specifies the alphabet for the sequence and the word or words. Choices are 'AA' or 'NT' (default).

Inputs

Seq

Amino acid or nucleotide sequence specified by any of the following:

Word

One or more short amino acid or nucleotide sequences specified by any of the following:

    Note   If the search word or words contain amino acid or nucleotide symbols that represent multiple symbols, then seqshowwords shows all possible matches. For example, the symbol R represents either G or A (purines). If Word is 'ART', then seqshowwords shows occurrences of both 'AAT' and 'AGT'.

    Tip   If Word contains a repeating pattern, such as 'TATA', then seqshowwords does not highlight overlapping patterns of TA in the sequence. To highlight multiple repeats of TA in a sequence, use a regular expression, such as 'TA(TA)*TA', for Word. For more information, see Examples.
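The matching behavior described in the Note and Tip can be approximated with the base MATLAB regexp function. This is a sketch for illustration only: it assumes that seqshowwords scans left to right for non-overlapping matches the way regexp does (the outputs in Examples are consistent with that), and the character class 'A[AG]T' is a hand-written expansion of 'ART', not something seqshowwords exposes.

```matlab
% Illustration only: regexp stands in for the matching that seqshowwords performs.
seq = 'GCTAGTAACGTATATATAAT';

% R (purine) stands for A or G, so the word 'ART' corresponds to 'A[AG]T'.
[artStart, artStop] = regexp(seq, 'A[AG]T');

rep = 'GCTATAACGTATATATATA';

% A plain word is matched without overlaps: once a 'TATA' is consumed,
% the scan resumes after it.
[plainStart, plainStop] = regexp(rep, 'TATA');

% The regular expression matches each whole run of two or more 'TA' repeats.
[runStart, runStop] = regexp(rep, 'TA(TA)*TA');

disp([artStart; artStop]);
disp([plainStart; plainStop]);
disp([runStart; runStop]);
```

Each displayed matrix has the start positions in its first row and the stop positions in its second row, mirroring the Start and Stop fields that seqshowwords returns.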

ColorValue

Color to highlight all occurrences of one or more words in the sequence. Specify the color with one of the following:

  • Three-element numeric vector of RGB values

  • String containing a predefined single-letter color code

  • String containing a predefined color name

For example, to use cyan, enter [0 1 1], 'c', or 'cyan'. For more information on specifying colors, see ColorSpec.

Default: Red, which is specified by [1 0 0], 'r', or 'red'

ColumnsValue

Positive integer specifying the number of columns (characters per line) in the output display of the sequence.

Default: 64

AlphabetValue

String specifying the type of sequences. Choices are 'AA' or 'NT' (default).
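The property name/property value pairs described above can be combined in a single call. The following is a minimal sketch that assumes Bioinformatics Toolbox is installed; the sequence is borrowed from the Examples section, and the variable name hits is illustrative.

```matlab
% Requires Bioinformatics Toolbox; opens a separate window showing the sequence.
seq = 'GCTATAACGTATATATATA';

% Highlight 'TATA' in cyan, 10 characters per line, treating the input as
% nucleotides. Property names are case insensitive and can appear in any order.
hits = seqshowwords(seq, 'TATA', 'Color', [0 1 1], 'Columns', 10, 'Alphabet', 'NT');

% First row: start positions. Second row: stop positions.
disp([hits.Start; hits.Stop]);
```

The 'Color' and 'Columns' properties affect only the display, so the returned positions should be the same as for a call without them.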

Outputs

Struct

MATLAB structure containing the start and stop positions of all occurrences of the word or words in the sequence. It includes two fields.

Field   Description
Start   Row vector containing the start position of each occurrence of the search word or words.
Stop    Row vector containing the stop position of each occurrence of the search word or words.
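One way to use the Start and Stop fields is to pull the matched text back out of the sequence. The sketch below hard-codes a structure with the values shown for the 'BART' search in Examples, so it runs without the toolbox; in practice the structure would come from a call such as s = seqshowwords(seq, 'BART'). The variable names are illustrative.

```matlab
seq = 'GCTAGTAACGTATATATAAT';

% Stand-in for the structure returned by seqshowwords(seq, 'BART').
s = struct('Start', [3 17], 'Stop', [6 20]);

% Extract the substring for each occurrence.
matches = arrayfun(@(a, b) seq(a:b), s.Start, s.Stop, 'UniformOutput', false);

% Length of each occurrence, in characters.
lens = s.Stop - s.Start + 1;

disp(matches);
disp(lens);
```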

Examples

Search for a word containing symbols that represent multiple nucleotides:

% Highlight the word 'BART', which matches 'TAGT' and 'TAAT' in this
% sequence (B represents C, G, or T; R represents A or G)
seqshowwords('GCTAGTAACGTATATATAAT','BART')

ans = 
    Start: [3 17]
    Stop: [6 20]

000001 GCTAGTAACGTATATATAAT
 

Search for a word that repeats, excluding overlaps:

% Highlight all occurrences of 'TATA', excluding those that are  
% already part of another matched word.
seqshowwords('GCTATAACGTATATATATA','TATA')

ans = 
    Start: [3 10 14]
    Stop: [6 13 17]

000001 GCTATAACGTATATATATA
 

Search for a word that repeats, including overlaps:

% Use the regular expression 'TA(TA)*TA' to highlight all multiple 
% repeats of 'TA'
seqshowwords('GCTATAACGTATATATATA','TA(TA)*TA')

ans = 
    Start: [3 10]
    Stop: [6 19]

000001 GCTATAACGTATATATATA
 

Search for multiple words:

% Use a cell array as input to highlight both the words 
% 'CG' and 'GC'
seqshowwords('GCTATAACGTATATATATA',{'CG', 'GC'})

ans = 
    Start: [1 8]
    Stop: [2 9]

000001 GCTATAACGTATATATATA

Alternatives

The seqtool function opens the Sequence Tool window, where you search for words in a sequence by selecting Sequence > Find Word. The Sequence Tool does not:

See Also

cleave | ColorSpec | palindromes | regexp | restrict | seqdisp | seqtool | seqwordcount | strfind


© 1984-2009 The MathWorks, Inc.