bowtieread - Read data from Bowtie file

Syntax

BWTStruct = bowtieread(File)
BWTStruct = bowtieread(File,Name,Value)

Description

BWTStruct = bowtieread(File) reads File, a Bowtie-formatted file (version 0.12.3), and returns the data in BWTStruct, a MATLAB array of structures.

BWTStruct = bowtieread(File,Name,Value) reads a Bowtie-formatted file with additional options specified by one or more Name,Value pair arguments.

Tips

If your Bowtie-formatted file is too large to read using available memory, use the 'BlockRead' name-value pair argument to read a subset of entries at a time.

Input Arguments

File

Either of the following:

  • String specifying a file name, or a path and file name, of a Bowtie-formatted file. If you specify only a file name, that file must be on the MATLAB search path or in the Current Folder.

  • MATLAB string containing the text of a Bowtie-formatted file.

The bowtieread function reads Bowtie-formatted files version 0.12.3.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments, where Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'BlockRead'

Scalar or vector that controls the reading of a single sequence entry or a block of sequence entries from a Bowtie-formatted file containing multiple sequences. Enter a scalar N to read the Nth entry in the file. Enter a 1-by-2 vector [M1, M2] to read a block of entries starting at entry M1 and ending at entry M2. To read all remaining entries in the file starting at entry M1, enter a positive value for M1 and Inf for M2.
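
The three forms of the 'BlockRead' value can be sketched as follows, using the sample01.bowtie file from the examples on this page. The variable names are illustrative only, and the sketch assumes that file is on the MATLAB search path.

```matlab
% Illustrative variable names; assumes sample01.bowtie is on the MATLAB path.

% Scalar N: read only the 3rd entry
third = bowtieread('sample01.bowtie', 'BlockRead', 3);

% 1-by-2 vector [M1, M2]: read entries 5 through 10
block = bowtieread('sample01.bowtie', 'BlockRead', [5 10]);

% Inf for M2: read from entry 5 through the end of the file
tail = bowtieread('sample01.bowtie', 'BlockRead', [5 Inf]);
```

A block [M1, M2] covers M2 - M1 + 1 entries, so the second call returns six structures.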

'ZeroBased'

Logical specifying whether bowtieread uses zero-based indexing when reading a file. The logical controls whether the Position field in BWTStruct contains zero-based or one-based positions. Choices are true, which returns zero-based positions, or false (default), which returns one-based positions.

Default: false
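
The relationship between the two conventions can be illustrated with a toy reference and read. This is only a sketch: the sequences and variable names below are hypothetical and do not come from any Bowtie file. A zero-based position is an offset from the first base, so the matching one-based position, which is also the MATLAB index, is one larger.

```matlab
% Hypothetical reference and read, for illustration only
ref  = 'ACGTACGTTT';
read = 'ACGT';

% Alignment start expressed as a zero-based offset into ref
zeroBasedPos = 4;

% The same start expressed as a one-based position (a valid MATLAB index)
oneBasedPos = zeroBasedPos + 1;

% Extract the aligned stretch of the reference using the one-based position
aligned = ref(oneBasedPos : oneBasedPos + numel(read) - 1);
```

Because MATLAB indexing is one-based, one-based positions can be used directly as indices into a reference sequence, while zero-based positions need the + 1 shift shown here.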

'AlignDetails'

Logical specifying whether or not to include the AlignDetails field in the BWTStruct output argument. The AlignDetails field includes information on mismatch descriptors. Choices are true (default) or false.

Default: true
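
One way to confirm the effect of this option is to check for the field with isfield. The sketch below assumes that sample01.bowtie, the file used in the examples on this page, is on the MATLAB search path; the variable names are illustrative.

```matlab
% Illustrative variable names; assumes sample01.bowtie is on the MATLAB path.

% Default behavior: the AlignDetails field is included
full = bowtieread('sample01.bowtie');

% Omit the AlignDetails field from the output
slim = bowtieread('sample01.bowtie', 'AlignDetails', false);

% Check for the field in each result
hasDetailsFull = isfield(full, 'AlignDetails');
hasDetailsSlim = isfield(slim, 'AlignDetails');
```

Based on the description above, hasDetailsFull should be true and hasDetailsSlim should be false.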

Output Arguments

BWTStruct

An N-by-1 array of structures containing sequence alignment and mapping information from a Bowtie-formatted file, where N is the number of alignment records stored in the Bowtie-formatted file. Each structure contains the following fields.

  • QueryName: Name of the aligned read sequence.

  • Strand: + or − specifying direction (forward or reverse) of the reference sequence to which the read sequence aligns.

  • ReferenceName: Name or numeric ID of the reference sequence to which the read sequence aligns.

  • Position: Position of the forward reference sequence where the leftmost base of the alignment of the read sequence starts. This position is zero-based or one-based, depending on the ZeroBased name-value pair argument.

  • Sequence: String containing the letter representations of the read sequence. It is the reverse complement if the read sequence aligns to the reverse strand of the reference sequence.

  • Quality: String containing the ASCII representation of the per-base quality score for the read sequence. The quality score is reversed if the read sequence aligns to the reverse strand of the reference sequence.

  • NumHits: The number of other instances where this read sequence aligns to an identical length of bases on another area of the reference sequence.

  • AlignDetails: Information on mismatches, insertions, and deletions in the alignment.
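
The following sketch shows some common ways to work with a structure array that has these fields. The records are hand-built and hypothetical, not read from a file, and the Phred+33 quality offset is an assumption that should be checked against the source of your data.

```matlab
% Hypothetical records that use the same field names as BWTStruct
recs = struct( ...
    'QueryName',     {'read1'; 'read2'; 'read3'}, ...
    'Strand',        {'+'; '-'; '+'}, ...
    'ReferenceName', {'chr1'; 'chr1'; 'chr2'}, ...
    'Position',      {101; 250; 17}, ...
    'Sequence',      {'ACGT'; 'GGCA'; 'TTAG'}, ...
    'Quality',       {'IIII'; '>>8<'; '5566'});

% Select the names of the reads that align to the reverse strand
isReverse    = strcmp({recs.Strand}, '-');
reverseNames = {recs(isReverse).QueryName};

% Collect all alignment start positions in one numeric vector
positions = [recs.Position];

% Convert ASCII quality characters to numeric scores
% (ASSUMPTION: Phred+33 encoding)
phred = double(recs(2).Quality) - 33;

% For a reverse-strand record, Sequence holds the reverse complement and
% Quality is reversed, so undoing both recovers the original read orientation
seq  = recs(2).Sequence;
comp = seq;
comp(seq == 'A') = 'T';
comp(seq == 'T') = 'A';
comp(seq == 'C') = 'G';
comp(seq == 'G') = 'C';
origRead = fliplr(comp);
origQual = fliplr(recs(2).Quality);
```

The {recs.Field} and [recs.Field] forms gather one field across the whole array, which avoids an explicit loop over the records.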

Examples

Read the alignment records (entries) from the sample01.bowtie file into a MATLAB array of structures and access some of the data:

% Read the alignment records stored in sample01.bowtie
data = bowtieread('sample01.bowtie')
data = 

17x1 struct array with fields:
    QueryName
    Strand
    ReferenceName
    Position
    Sequence
    Quality
    NumHits
    AlignDetails

% Access the quality score for the 6th entry
data(6).Quality
ans =

>>>><>>>>>>>>>6>>>8>8<>/>58<:>66-(6

% Determine the strand direction (forward or reverse) of the reference
% sequence to which the 14th entry aligns
data(14).Strand
ans =

+
 

Read a block of alignment records (entries) from the sample01.bowtie file into a MATLAB array of structures:

% Read a block of six entries from a Bowtie file
data_5_10 = bowtieread('sample01.bowtie','BlockRead', [5 10])
data_5_10 = 

6x1 struct array with fields:
    QueryName
    Strand
    ReferenceName
    Position
    Sequence
    Quality
    NumHits
    AlignDetails

References

[1] Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10(3), R25.

See Also

bamread | fastqread | samread | soapread

© 1984-2012 The MathWorks, Inc.