bowtieread - Read data from Bowtie file
Syntax
BWTStruct = bowtieread(File)
BWTStruct = bowtieread(File,Name,Value)
Description
BWTStruct = bowtieread(File) reads File,
a Bowtie-formatted file (version 0.12.3), and returns the data in BWTStruct,
a MATLAB array of structures.
BWTStruct = bowtieread(File,Name,Value) reads
a Bowtie-formatted file with additional options specified by one or
more Name,Value pair arguments.
Tips
If your Bowtie-formatted file is too large to read using available
memory, try either of the following:
Use the BlockRead name-value pair
argument to read a subset of entries.
Create a BioIndexedFile object
from the Bowtie-formatted file (using 'TABLE' for
the Format), and then access the entries
using methods of the BioIndexedFile class.
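The second tip can be sketched as follows. This is a minimal sketch, assuming that the sample01.bowtie file used in the Examples section is on the MATLAB search path. Writing the index file to tempdir is an illustrative choice, not a requirement.

```matlab
% Locate the sample file (assumed to be on the MATLAB search path).
bowtieFile = which('sample01.bowtie');

% Build an index over the file, treating it as a tab-delimited table.
% The index file is written to tempdir here (an illustrative choice).
bif = BioIndexedFile('TABLE', bowtieFile, tempdir);

% Number of indexed entries, obtained without loading the alignment data.
numEntries = bif.NumEntries;

% Retrieve the raw text of the 6th entry only.
entryText = getEntryByIndex(bif, 6);
disp(entryText)
```

The entry comes back as unparsed text, so this approach trades the convenience of the BWTStruct fields for a much smaller memory footprint.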
Input Arguments
File |
Either of the following:
String specifying a file name or path and file name
of a Bowtie-formatted file. If you specify only a file name, that
file must be on the MATLAB search path or in the Current Folder.
MATLAB string containing the text of a Bowtie-formatted
file.
The bowtieread function reads Bowtie-formatted
files version 0.12.3. |
Name-Value Pair Arguments
Specify optional comma-separated pairs of Name,Value arguments,
where Name is the argument
name and Value is the corresponding
value. Name must appear
inside single quotes (' ').
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN.
'BlockRead' |
Scalar or vector that controls the reading of a single sequence
entry or block of sequence entries from a Bowtie-formatted file containing
multiple sequences. Enter a scalar N to
read the Nth entry in the file. Enter a
1-by-2 vector [M1, M2] to read a block
of entries starting at entry M1 and
ending at entry M2. To read all remaining
entries in the file starting at entry M1,
enter a positive value for M1 and Inf for M2.
|
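The three forms of BlockRead described above might be used as follows. This is a minimal sketch, assuming the sample01.bowtie file from the Examples section; the variable names are illustrative.

```matlab
bowtieFile = 'sample01.bowtie';   % sample file, assumed to be on the search path

% Scalar N: read only the 6th entry.
sixthEntry = bowtieread(bowtieFile, 'BlockRead', 6);

% Vector [M1 M2]: read entries 5 through 10, inclusive.
blockEntries = bowtieread(bowtieFile, 'BlockRead', [5 10]);

% Vector [M1 Inf]: read from entry 10 to the end of the file.
remainingEntries = bowtieread(bowtieFile, 'BlockRead', [10 Inf]);
```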
'ZeroBased' |
Logical specifying whether bowtieread returns
zero-based or one-based positions in the Position field
of BWTStruct. Choices are true, which returns zero-based
positions, or false (default), which returns one-based
positions.
Default: false |
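The effect of ZeroBased can be checked by reading the same file both ways. This is a minimal sketch, assuming the sample01.bowtie file from the Examples section and assuming that Position is returned as a numeric scalar in each structure.

```matlab
bowtieFile = 'sample01.bowtie';   % sample file, assumed to be on the search path

% Read the same records with one-based (default) and zero-based positions.
oneBased  = bowtieread(bowtieFile);
zeroBased = bowtieread(bowtieFile, 'ZeroBased', true);

% Difference between the two conventions for every record.
% If the description above holds, every element should equal 1.
offset = double([oneBased.Position]) - double([zeroBased.Position]);
disp(unique(offset))
```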
'AlignDetails' |
Logical specifying whether or not to include the AlignDetails field
in the BWTStruct output argument. The AlignDetails field
includes information on mismatch descriptors. Choices are true (default)
or false. Default: true |
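For example, the AlignDetails field might be omitted like this. This is a minimal sketch, assuming the sample01.bowtie file from the Examples section.

```matlab
bowtieFile = 'sample01.bowtie';   % sample file, assumed to be on the search path

% Read the records without the mismatch descriptors.
noDetails = bowtieread(bowtieFile, 'AlignDetails', false);

% Check whether the field is present in the output.
hasDetails = isfield(noDetails, 'AlignDetails');
disp(hasDetails)
```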
Output Arguments
BWTStruct |
An N-by-1 array of structures containing
sequence alignment and mapping information from a Bowtie-formatted
file, where N is the number of alignment records
stored in the Bowtie-formatted file. Each structure contains the following
fields.
| Field | Description |
| QueryName | Name of the aligned read sequence. |
| Strand | + or - specifying direction (forward or reverse) of
the reference sequence to which the read sequence aligns. |
| ReferenceName | Name or numeric ID of the reference sequence to which the read
sequence aligns. |
| Position | Position of the forward reference sequence where the leftmost
base of the alignment of the read sequence starts. This position is
zero-based or one-based, depending on the ZeroBased name-value
pair argument. |
| Sequence | String containing the letter representations of the read sequence.
It is the reverse complement if the read sequence aligns to the reverse
strand of the reference sequence. |
| Quality | String containing the ASCII representation of the per-base
quality score for the read sequence. The quality score is reversed
if the read sequence aligns to the reverse strand of the reference
sequence. |
| NumHits | The number of other instances where this
read sequence aligns to an identical length of bases on another area
of the reference sequence. |
| AlignDetails | Information on mismatches, insertions, and deletions in the
alignment. |
|
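Because BWTStruct is a structure array, a field can be collected across all records with comma-separated list syntax. The following is a minimal sketch, assuming the sample01.bowtie file from the Examples section and assuming that Strand holds the character + or -; the variable names are illustrative.

```matlab
% Read all alignment records.
data = bowtieread('sample01.bowtie');

% Gather the Strand field of every record into a cell array.
strands = {data.Strand};

% Count reads aligned to the forward and reverse strands.
numForward = sum(strcmp(strands, '+'));
numReverse = sum(strcmp(strands, '-'));

% Length of each read sequence.
readLengths = cellfun(@length, {data.Sequence});
```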
Examples
Read the alignment records (entries) from the sample01.bowtie file
into a MATLAB array of structures and access some of the data:
% Read the alignment records stored in sample01.bowtie
data = bowtieread('sample01.bowtie')
data =
17x1 struct array with fields:
QueryName
Strand
ReferenceName
Position
Sequence
Quality
NumHits
AlignDetails
% Access the quality score for the 6th entry
data(6).Quality
ans =
>>>><>>>>>>>>>6>>>8>8<>/>58<:>66-(6
% Determine the strand direction (forward or reverse) of the reference
% sequence to which the 14th entry aligns
data(14).Strand
ans =
+
Read a block of alignment records (entries) from the sample01.bowtie file
into a MATLAB array of structures:
% Read a block of six entries from a Bowtie file
data_5_10 = bowtieread('sample01.bowtie','blockread', [5 10])
data_5_10 =
6x1 struct array with fields:
QueryName
Strand
ReferenceName
Position
Sequence
Quality
NumHits
AlignDetails
References
[1] Langmead, B., Trapnell, C., Pop, M., and
Salzberg, S. (2009). Ultrafast and memory-efficient alignment of short
DNA sequences to the human genome. Genome Biol. 10, 3,
R25.
See Also
bamread | fastqread | samread | soapread