bamindexread - Read Binary Sequence Alignment/Map Index (BAI) file
Syntax
Index = bamindexread(File)
Description
Index = bamindexread(File) reads File,
a BAI file, and returns Index, a MATLAB structure
that specifies the offsets into the compressed Binary Sequence Alignment/Map
(BAM) file and decompressed data block for each reference sequence
and range of positions (bins) on each reference sequence.
Tips
The bamread function
uses the Index structure returned by bamindexread to
index into a BAM file to extract alignment records in a specified
range of a specific reference sequence. Passing the Index structure
array to the bamread function improves performance
when reading from the same BAM file multiple times, for example, when
reading different ranges of a reference sequence.
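The reuse pattern described above can be sketched as a loop over several ranges of one reference sequence. This is a minimal illustration, assuming the ex1.bam file shipped with Bioinformatics Toolbox and its BAI file are available; the ranges array is hypothetical and chosen only for demonstration.

```matlab
% Build the index once, then reuse it for several range queries.
% Assumption: ex1.bam and its BAI file are on the MATLAB search path.
ind = bamindexread('ex1.bam');

% Hypothetical ranges on reference 'seq1', one [start end] pair per row.
ranges = [100 200; 300 400; 500 600];

% Collect the alignment records for each range in a cell array.
data = cell(size(ranges, 1), 1);
for k = 1:size(ranges, 1)
    data{k} = bamread('ex1.bam', 'seq1', ranges(k, :), 'index', ind);
end
```

Because ind is created outside the loop, the BAI file is read once rather than once per query.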
Input Arguments
File
String specifying a file name, or a path and a file name, of
a BAM file or a BAI file. If File is a BAM file, bamindexread reads
the corresponding BAI file, that is, the BAI file with the same root
name and stored in the same folder as the BAM file. If you specify
only a file name, that file must be on the MATLAB search path
or in the Current Folder.
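As a hedged sketch of the two ways to specify File, the following calls pass a bare file name and then a full path. It assumes ex1.bam is on the MATLAB search path, and it uses the which function only as one convenient way to obtain a full path for illustration.

```matlab
% File name only: the file must be on the MATLAB search path
% or in the Current Folder.
ind1 = bamindexread('ex1.bam');

% Path and file name: here the full path is looked up with which,
% an illustrative choice; any valid path to the BAM file would do.
bamPath = which('ex1.bam');
ind2 = bamindexread(bamPath);
```

In both calls File names a BAM file, so bamindexread reads the BAI file with the same root name in the same folder.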
Output Arguments
Index
MATLAB array of structures that specifies the offsets into
the compressed Binary Sequence Alignment/Map (BAM) file and decompressed
data block for each reference sequence and range of positions (bins)
on the reference sequence. Index contains the
following fields.
| Field | Description |
| Filename | Name of the BAM file or BAI file used to create the Index array of structures. |
| Index | A 1-by-N array of structures, where N is the number of reference sequences in the corresponding BAM file. |
Each structure in the Index field contains the following fields.
| Field | Description |
| BinID | Array of bin IDs for one reference sequence. |
| BGZFOffsetStart | Offset in the BAM file to the start of the first BGZF block where alignment records associated with the corresponding BinID are stored. |
| BGZFOffsetEnd | Offset in the BAM file to the start of the last BGZF block where alignment records associated with the corresponding BinID are stored. |
| DataOffsetStart | Offset in the decompressed data block to the start of where alignment records associated with the corresponding BinID are stored. |
| DataOffsetEnd | Offset in the decompressed data block to the end of where alignment records associated with the corresponding BinID are stored. |
| LinearBGZFOffset | Offset in the BAM file to the first alignment in the corresponding 16384 bp interval. |
| LinearDataOffset | Offset in the decompressed data file to the first alignment in the corresponding 16384 bp interval. |
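The returned structure can be inspected with ordinary MATLAB struct operations. The following is a minimal sketch, assuming the ex1.bam file shipped with Bioinformatics Toolbox; the variable names nRefs, ref1, and nBins are introduced here only for illustration.

```matlab
% Read the index for ex1.bam (assumed to be on the MATLAB search path).
ind = bamindexread('ex1.bam');

% Name of the file used to create the index.
disp(ind.Filename)

% One structure per reference sequence in the BAM file.
nRefs = numel(ind.Index);

% Look at the entry for the first reference sequence.
ref1 = ind.Index(1);
disp(fieldnames(ref1))      % lists the fields of one entry

% Number of bin IDs recorded for the first reference sequence.
nBins = numel(ref1.BinID);
fprintf('%d reference sequence(s); first one has %d bin ID(s)\n', nRefs, nBins);
```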
Examples
Read the BAM index file associated with the ex1.bam file,
both of which are included with Bioinformatics Toolbox. Then use
the return structure to read multiple alignment records from the ex1.bam file
that align to two different reference sequences:
ind = bamindexread('ex1.bam');
data1 = bamread('ex1.bam', 'seq1', [100 200], 'index', ind);
data2 = bamread('ex1.bam', 'seq2', [100 200], 'index', ind);
References
[1] Li, H., Handsaker, B., Wysoker, A., Fennell,
T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R. (2009).
The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16),
2078–2079.
See Also
baminfo | bamread