getCompactAlignment - Class: BioMap

Construct compact alignment represented in BioMap object

Syntax

CompAlignment = getCompactAlignment(BioObj, StartPos, EndPos)
[CompAlignment, Indices] = getCompactAlignment(BioObj, StartPos, EndPos)
[CompAlignment, Indices, Rows] = getCompactAlignment(BioObj, StartPos, EndPos)
... = getCompactAlignment(BioObj, StartPos, EndPos, 'ParameterName', ParameterValue)

Description

CompAlignment = getCompactAlignment(BioObj, StartPos, EndPos) returns CompAlignment, a character array containing the aligned read sequences from BioObj, a BioMap object, in a compact format. The read sequences must align within a specific region of the reference sequence, which is defined by StartPos and EndPos, two positive integers such that StartPos is less than EndPos, and both are smaller than the length of the reference sequence.

[CompAlignment, Indices] = getCompactAlignment(BioObj, StartPos, EndPos) returns Indices, a vector of indices specifying the read sequences that align within a specific region of the reference sequence.

[CompAlignment, Indices, Rows] = getCompactAlignment(BioObj, StartPos, EndPos) returns Rows, a vector of positive numbers specifying the row in CompAlignment where each read sequence is best displayed.

... = getCompactAlignment(BioObj, StartPos, EndPos, 'ParameterName', ParameterValue) accepts one or more comma-separated parameter name/value pairs. Specify ParameterName inside single quotes.

Input Arguments

BioObj

Object of the BioMap class.

StartPos

Positive integer that defines the start of a region of the reference sequence. StartPos must be less than EndPos, and smaller than the total length of the reference sequence.

EndPos

Positive integer that defines the end of a region of the reference sequence. EndPos must be greater than StartPos, and smaller than the total length of the reference sequence.

Parameter Name/Value Pairs

'Full'

Specifies whether or not to include only the read sequences that fully align with the defined region of the reference sequence, that is, they are completely contained within the region, and do not extend beyond the region. Choices are true or false (default).

Default: false

'TrimAlignment'

Specifies whether or not to trim empty leading and trailing columns from the alignment. Choices are true or false. Default is false, which does not trim the alignment but keeps any empty leading or trailing columns, so the returned alignment always has length EndPos - StartPos + 1.

Default: false
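The two name/value pairs can be combined in a single call. The following is a minimal sketch, assuming that Bioinformatics Toolbox is installed and that the ex1.sam sample file used in the Examples section is on the MATLAB path. The variable names CompDefault, CompFull, and IndFull are illustrative only.

```matlab
% Construct a BioMap object from the sample SAM file (assumed to be on the path)
BMObj1 = BioMap('ex1.sam');

% Default behavior ('Full' is false, 'TrimAlignment' is false): reads do not
% have to be completely contained in the region, and the alignment has
% EndPos - StartPos + 1 columns
CompDefault = getCompactAlignment(BMObj1, 30, 59);

% Keep only reads completely contained in the region, and trim any
% empty leading and trailing columns from the result
[CompFull, IndFull] = getCompactAlignment(BMObj1, 30, 59, ...
    'Full', true, 'TrimAlignment', true);

% Compare the sizes of the two character arrays
size(CompDefault)
size(CompFull)
```

With the default settings, the number of columns in CompDefault should equal 59 - 30 + 1 = 30, as described for 'TrimAlignment' above.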

Output Arguments

CompAlignment

Character array containing the aligned read sequences from BioObj that align within a specific region of the reference sequence. The character array represents a compact alignment; that is, each row of the character array contains one or more aligned sequences, arranged so that the number of rows in the character array is minimized. Each aligned sequence includes only the sequence positions that fall within the specified region of the reference sequence, and each aligned sequence can include gaps.

Indices

Vector of indices specifying the read sequences from BioObj that align within a specific region of the reference sequence.

Rows

Vector of positive integers specifying the row in CompAlignment where each read sequence listed in Indices is displayed.
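To illustrate how several reads can share one row of a compact alignment, the following sketch assigns reads to rows with a simple greedy rule: each read goes in the first row whose last occupied position lies before the start of the read. This is an illustration only and is not necessarily the method that getCompactAlignment uses. The start and stop positions are hypothetical values, assumed to be sorted by start position.

```matlab
% Hypothetical start and stop positions of five reads, sorted by start
starts = [1 3 8 12 20];
stops  = [10 9 15 18 25];

rows   = zeros(size(starts));  % row assigned to each read
rowEnd = [];                   % last occupied reference position in each row

for k = 1:numel(starts)
    % First existing row that ends before this read starts
    r = find(rowEnd < starts(k), 1);
    if isempty(r)
        % No row has room, so open a new row
        rowEnd(end+1) = stops(k); %#ok<SAGROW>
        r = numel(rowEnd);
    else
        % Reuse the row and extend its occupied range
        rowEnd(r) = stops(k);
    end
    rows(k) = r;
end

disp(rows)   % displays 1 2 3 1 1
```

In this sketch, the five reads fit in three rows because the fourth and fifth reads start after the first row has been freed.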

Examples

Construct a BioMap object, and then construct the compact alignment between positions 30 and 59 of the reference sequence:

% Construct a BioMap object from a SAM file 
BMObj1 = BioMap('ex1.sam');
% Construct the compact alignment between positions 30 and 59 of
% the reference sequence, and return the indices of the reads in the
% compact alignment, as well as the row each read is in. 
[CompAlignment, Ind, Row] = getCompactAlignment(BMObj1, 30, 59)
CompAlignment =

TAACTCG      GCCCAGCATTAGGGAGC
TAACTCGT           CATTAGGGAGC
TAACTCGTCC          ATTAGGGAGC
TAACTCTTCTCT         TTAGGGAGC
TAACTCGTCCATGG        TAGGGAGC
TAACTCGTCCCTGGCCCA           C
TAACTCGTCCATGGCCCAG           
TAACTCGTCCATTGCCCAGC          
TAACTCGTCCATGGCCCAGCATT       
TAACTCGTCCATGGCCCAGCATTTGGG   
TAACTCGTCCATGGCCCAGCATTAGGG   
TAACTCGTCCATGGCCCAGCATTAGGGAGC
TAACTCGTCCATGGCCCAGCATTAGGGATC
TAACTCGTCCATGGCCCAGCATTAGGGAGC
 AACTCGTCCATGGCCCAGCATTAGGGAGC
      GTACATGGCCCAGCATTAGGGAGC
       TCCATGGCCCAGCATTAGGGCGC


Ind =

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23


Row =

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    16
    17
     1
     2
     3
     4
     5
     6

Algorithms

getCompactAlignment assumes the reference sequence has no gaps. Therefore, positions in reads corresponding to insertions (I) and padding (P) do not appear in the alignment.

Because soft clipped positions (S) are not associated with positions that align to the reference sequence, they do not appear in the alignment.

A skipped position (N) appears as a - (hyphen) in the alignment.

Hard clipped positions (H) do not appear in the sequences or the alignment.
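The rules above can be sketched in plain MATLAB code that expands one read according to its CIGAR string. This is an illustrative sketch, not the toolbox implementation. The read and CIGAR string are hypothetical, and showing deleted positions (D) as hyphens is an assumption, because this page does not describe how deletions are displayed.

```matlab
% Hypothetical read sequence and CIGAR string
readSeq = 'ACGTTGCA';
cigar   = '2S3M1I2N2M';

% Split the CIGAR string into {count, operation} pairs
tokens = regexp(cigar, '(\d+)([MIDNSHP=X])', 'tokens');

aligned = '';   % read as displayed against the ungapped reference
readPos = 1;    % next unread position in readSeq

for k = 1:numel(tokens)
    n  = str2double(tokens{k}{1});
    op = tokens{k}{2};
    switch op
        case {'M', '=', 'X'}
            % Aligned positions: copy the bases from the read
            aligned = [aligned, readSeq(readPos:readPos+n-1)]; %#ok<AGROW>
            readPos = readPos + n;
        case {'I', 'S'}
            % Insertions and soft clipping: consume read bases, display nothing
            readPos = readPos + n;
        case {'N', 'D'}
            % Skipped positions appear as hyphens (D handled the same way by assumption)
            aligned = [aligned, repmat('-', 1, n)]; %#ok<AGROW>
        case {'H', 'P'}
            % Hard clipping and padding: not in the read, not displayed
    end
end

disp(aligned)   % displays GTT--CA
```

The two soft clipped bases and the inserted base are dropped, and the two skipped reference positions appear as hyphens.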

See Also

align2cigar | BioMap | cigar2align | getAlignment | getBaseCoverage

© 1984-2012 The MathWorks, Inc.