seqshowwords - Graphically display words in sequence

Syntax

seqshowwords(Seq, Word)

seqshowwords(Seq, Word, ...'Color', ColorValue, ...)
seqshowwords(Seq, Word, ...'Columns', ColumnsValue, ...)
seqshowwords(Seq, Word, ...'Alphabet', AlphabetValue, ...)

Arguments

Seq

Enter either a nucleotide or amino acid sequence. You can also enter a structure with the field Sequence.

Word

Enter a short character sequence.

ColorValue

Property to select the color for highlighted characters. Enter a 1-by-3 RGB vector specifying the intensity (0–255) of the red, green, and blue components, or enter a character from the following list: 'b' (blue), 'g' (green), 'r' (red), 'c' (cyan), 'm' (magenta), or 'y' (yellow).

The default color is red 'r'.

ColumnsValue

Property to specify the number of characters in a line. Default value is 64.

AlphabetValue

Property to select the alphabet. Enter 'AA' for amino acid sequences or 'NT' for nucleotide sequences. The default is 'NT'.

Description

seqshowwords(Seq, Word) displays the sequence with all occurrences of a word highlighted, and returns a structure with the start and stop positions for all occurrences of the word in the sequence.

seqshowwords(Seq, Word, ...'PropertyName', PropertyValue, ...) calls seqshowwords with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotes and is case insensitive. These property name/property value pairs are as follows:


seqshowwords(Seq, Word, ...'Color', ColorValue, ...) selects the color used to highlight the words in the output display.

seqshowwords(Seq, Word, ...'Columns', ColumnsValue, ...) specifies how many columns per line to use in the output.

seqshowwords(Seq, Word, ...'Alphabet', AlphabetValue, ...) selects the alphabet for the sequence (Seq) and the word (Word).
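
As an illustration of combining several properties in one call, the following is a minimal sketch that uses only the property names and values described above. The variable names seq and matches are chosen for illustration, and the exact appearance of the display may differ from what the comments suggest.

```matlab
% Sketch only: combine several optional properties in one call.
seq = 'GCTAGTAACGTATATATAAT';

% Highlight matches in blue, wrap the display at 10 characters per line,
% and state the nucleotide alphabet explicitly (it is also the default).
matches = seqshowwords(seq, 'BART', ...
    'Color', 'b', 'Columns', 10, 'Alphabet', 'NT');

% The returned structure holds the start and stop positions of the matches.
disp(matches.Start)
disp(matches.Stop)
```

Because the properties can be given in any order, the same call could list 'Alphabet' first without changing the result.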

If the search word (Word) contains nucleotide or amino acid symbols that represent multiple possible symbols, then seqshowwords shows all matches. For example, the symbol R represents either G or A (purines). If Word is 'ART', then seqshowwords shows occurrences of both 'AAT' and 'AGT'.
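
One way to picture the handling of ambiguity symbols is as a translation of the word into a regular expression. The following is a minimal sketch using only base MATLAB functions. It is not a description of how seqshowwords is implemented, and the lookup table codes is a hypothetical, partial table of IUPAC nucleotide codes introduced here for illustration.

```matlab
% Sketch only: expand ambiguity symbols into a regular expression and
% locate the matches with the base MATLAB function regexp.
seq  = 'GCTAGTAACGTATATATAAT';
word = 'BART';

% Hypothetical lookup table (a subset of the IUPAC nucleotide codes).
codes = struct('R', '[AG]', 'Y', '[CT]', 'B', '[CGT]', 'N', '[ACGT]');

% Replace each ambiguity symbol with its character class; copy other
% symbols unchanged.
pattern = '';
for k = 1:numel(word)
    ch = word(k);
    if isfield(codes, ch)
        pattern = [pattern codes.(ch)]; %#ok<AGROW>
    else
        pattern = [pattern ch]; %#ok<AGROW>
    end
end

% pattern is now '[CGT]A[AG]T'
[startPos, stopPos] = regexp(seq, pattern, 'start', 'end');
```

For this sequence, startPos is [3 17] and stopPos is [6 20], which corresponds to the matches 'TAGT' and 'TAAT'.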

Examples

This example shows two matches, 'TAGT' and 'TAAT', for the word 'BART'.

seqshowwords('GCTAGTAACGTATATATAAT','BART')

ans = 
    Start: [3 17]
    Stop: [6 20]

000001 GCTAGTAACGTATATATAAT

seqshowwords does not highlight overlapping patterns multiple times. This example highlights two regions: the first occurrence of 'TATA', and the 'TATATATA' immediately after 'CG', which consists of two adjacent matches. The final 'TA' is not highlighted because the preceding 'TA' is part of an already matched pattern.

seqshowwords('GCTATAACGTATATATATA','TATA')

ans = 
    Start: [3 10 14]
    Stop: [6 13 17]

000001 GCTATAACGTATATATATA
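
The difference between non-overlapping and overlapping matches can be reproduced with the base MATLAB functions regexp and strfind. This is a minimal sketch for comparison only; it does not claim that seqshowwords uses either function internally, and the variable names are chosen for illustration.

```matlab
% Sketch only: compare non-overlapping and overlapping searches.
seq = 'GCTATAACGTATATATATA';

% regexp resumes searching after the end of each match, so matches
% never overlap.
nonOverlapping = regexp(seq, 'TATA', 'start');   % [3 10 14]

% strfind reports every position where the pattern occurs, including
% occurrences that overlap an earlier one.
allStarts = strfind(seq, 'TATA');                % [3 10 12 14 16]
```

The non-overlapping start positions agree with the Start field shown above, while the overlapping search also finds the occurrences starting at positions 12 and 16.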

To highlight all multiple repeats of TA, use the regular expression 'TA(TA)*TA'.

seqshowwords('GCTATAACGTATATATATA','TA(TA)*TA')

ans = 
    Start: [3 10]
    Stop: [6 19]

000001 GCTATAACGTATATATATA

See Also

Bioinformatics Toolbox™ functions: palindromes, cleave, restrict, seqdisp, seqtool, seqwordcount

MATLAB® functions: strfind, regexp

  


© 1984-2008 The MathWorks, Inc.