seqshowwords

Graphically display words in sequence

Syntax

Struct = seqshowwords(Seq, Word)
seqshowwords(Seq, Word, ...'Color', ColorValue, ...)
seqshowwords(Seq, Word, ...'Columns', ColumnsValue, ...)
seqshowwords(Seq, Word, ...'Alphabet', AlphabetValue, ...)

Description

Struct = seqshowwords(Seq, Word) opens a separate window displaying a sequence with all occurrences of one or more words highlighted. It also returns a structure containing the start and stop positions for all occurrences of the words in the sequence.

seqshowwords(Seq, Word, ...'PropertyName', PropertyValue, ...) calls seqshowwords with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Enclose each PropertyName in single quotation marks. Each PropertyName is case insensitive. These property name/property value pairs are as follows:

seqshowwords(Seq, Word, ...'Color', ColorValue, ...) specifies the color to highlight the words in the output display of the sequence. Default is red.

seqshowwords(Seq, Word, ...'Columns', ColumnsValue, ...) specifies the number of columns (characters per line) in the output display of the sequence. Default is 64.

seqshowwords(Seq, Word, ...'Alphabet', AlphabetValue, ...) specifies the alphabet for the sequence and the word or words. Choices are 'AA' or 'NT' (default).
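As an illustration of the name/value syntax described above, the following minimal sketch combines all three properties in one call. It assumes Bioinformatics Toolbox is installed. The sequence is borrowed from the Examples section, and the particular property values (cyan, 10 columns) are arbitrary choices for demonstration.

```matlab
% Sketch only: requires Bioinformatics Toolbox and opens a display window.
seq = 'GCTAGTAACGTATATATAAT';

% Property names are case insensitive and may appear in any order.
% [0 1 1] is the RGB triplet for cyan; 'NT' is the documented default alphabet.
s = seqshowwords(seq, 'BART', ...
    'Color', [0 1 1], ...
    'Columns', 10, ...
    'Alphabet', 'NT');
```

The returned structure s can then be inspected through its Start and Stop fields.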

Input Arguments

Seq

Amino acid or nucleotide sequence specified by any of the following:

Word

One or more short amino acid or nucleotide sequences specified by any of the following:

    Note:   If the search word or words contain amino acid or nucleotide symbols that represent multiple symbols, then seqshowwords shows all possible matches. For example, the symbol R represents either G or A (purines). If Word is 'ART', then seqshowwords shows occurrences of both 'AAT' and 'AGT'.
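The expansion described in this note can be approximated with the built-in regexp function. The following is a minimal sketch, not the actual implementation of seqshowwords: the character classes are written out by hand from the IUPAC nucleotide codes (R is A or G, B is C, G, or T), and the variable names are illustrative.

```matlab
% Sketch only: emulate ambiguous-symbol matching with base MATLAB regexp.
seq = 'GCTAGTAACGTATATATAAT';

% 'BART' expanded by hand: B -> [CGT], R -> [AG].
% Assumption: the sequence itself contains only A, C, G, and T.
pattern = '[CGT]A[AG]T';

% regexp returns the first and last index of each non-overlapping match.
[startIdx, stopIdx] = regexp(seq, pattern);

% Collect the matched substrings to see which concrete words were found.
matches = regexp(seq, pattern, 'match');
```

For this sequence the sketch finds TAGT at positions 3 to 6 and TAAT at positions 17 to 20, which agrees with the 'BART' example in Examples.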

    Tip   If Word contains a repeating pattern, such as 'TATA', then seqshowwords does not highlight overlapping patterns of TA in the sequence. To highlight multiple repeats of TA in a sequence, use a regular expression, such as 'TA(TA)*TA', for Word. For more information, see Examples.
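The difference between the two forms of Word can be reproduced with the built-in regexp function. This is a hedged sketch that assumes regexp follows the same left-to-right, non-overlapping matching as seqshowwords does here; the variable names are illustrative.

```matlab
% Sketch only: compare a plain repeating word with a regular expression.
seq = 'GCTATAACGTATATATATA';

% A plain word is matched left to right without overlaps. The scan
% resumes after each match, so the final 'TA' (positions 18-19) is
% left unmatched.
[s1, e1] = regexp(seq, 'TATA');

% The regular expression is greedy: each run of two or more 'TA'
% units is reported as a single match.
[s2, e2] = regexp(seq, 'TA(TA)*TA');

% Number of 'TA' units in each run, derived from the run lengths.
repeatsPerRun = (e2 - s2 + 1) / 2;
```

With this sequence, the plain word yields three matches (starting at 3, 10, and 14), while the regular expression yields two runs (3 to 6 and 10 to 19) containing 2 and 5 repeats of TA.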

ColorValue

Color to highlight all occurrences of one or more words in the sequence. Specify the color with one of the following:

  • Three-element numeric vector of RGB values

  • String containing a predefined single-letter color code

  • String containing a predefined color name

For example, to use cyan, enter [0 1 1], 'c', or 'cyan'. For more information on specifying colors, see ColorSpec.

Default: Red, which is specified by [1 0 0], 'r', or 'red'

ColumnsValue

Positive integer specifying the number of columns (characters per line) in the output display of the sequence.

Default: 64

AlphabetValue

String specifying the type of sequences. Choices are 'AA' or 'NT' (default).

Output Arguments

Struct

MATLAB structure containing the start and stop positions of all occurrences of the word or words in the sequence. It includes two fields.

Field    Description
Start    Row vector containing the start position of each occurrence of the search word or words.
Stop     Row vector containing the stop position of each occurrence of the search word or words.
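The Start and Stop vectors can be used to index back into the sequence. The following minimal sketch uses a hand-built structure as a stand-in for the function output, with values copied from the 'BART' example in Examples, so that it runs without opening a display window. The variable names are illustrative.

```matlab
% Sketch only: post-process the positions returned by seqshowwords.
seq = 'GCTAGTAACGTATATATAAT';

% Stand-in for s = seqshowwords(seq, 'BART'); values taken from Examples.
s = struct('Start', [3 17], 'Stop', [6 20]);

% Extract the substring for each occurrence.
words = arrayfun(@(a, b) seq(a:b), s.Start, s.Stop, ...
    'UniformOutput', false);

% Length of each occurrence and total number of occurrences.
wordLengths = s.Stop - s.Start + 1;
numHits = numel(s.Start);
```

Here words contains 'TAGT' and 'TAAT', the two concrete sequences that the ambiguous word matched.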

Examples

Search for a word containing multiple symbols:

% Highlight the word 'BART', which matches 'TAGT' and 'TAAT' in this sequence
seqshowwords('GCTAGTAACGTATATATAAT','BART')

ans = 
    Start: [3 17]
     Stop: [6 20]

 

Search for a word that repeats, excluding overlaps:

% Highlight all occurrences of 'TATA', excluding those that are  
% already part of another matched word.
seqshowwords('GCTATAACGTATATATATA','TATA')

ans = 
    Start: [3 10 14]
     Stop: [6 13 17]

 

Search for a word that repeats, including overlaps:

% Use the regular expression 'TA(TA)*TA' to highlight each run of
% two or more consecutive repeats of 'TA' as a single match
seqshowwords('GCTATAACGTATATATATA','TA(TA)*TA')

ans = 
    Start: [3 10]
     Stop: [6 19]

 

Search for multiple words:

% Use a cell array as input to highlight both the words 
% 'CG' and 'GC'
seqshowwords('GCTATAACGTATATATATA',{'CG', 'GC'})

ans = 
    Start: [1 8]
     Stop: [2 9]

Alternatives

The seqviewer function opens the Biological Sequence Viewer, where you search for words in a sequence by selecting Sequence > Find Word. The Biological Sequence Viewer does not:

  • Allow searching for multiple words in one step

  • Return a structure containing the start and stop positions for all occurrences of the word in the sequence
