How to obtain InsertSize from a BioMap object?

5 views (last 30 days)
Razvan
Razvan on 26 Jan 2015
Edited: Razvan on 7 Feb 2015
Is there a way of obtaining the InsertSize field (available using bamread) when opening a bam file as a BioMap object? Opening the bam files using a BioMap object is much faster compared to the bamread function, but I can't find any way of obtaining the InsertSize (TLEN from the corresponding sam file)...

Answers (2)

Luuk van Oosten
Luuk van Oosten on 26 Jan 2015
If you use something like:
BAMStruct = bamread(File,RefSeq,Range)
You get a struct. from this you can obtain the InsertSize by using something like:
x = BAMStruct.InsertSize
In the examples of the help doc (which are here) you will find something similar; they show an example how they extract instead of the 'InsertSize' the 'Position' by using the following:
data_one = bamread('ex1.bam','seq1', [100 300]);
data_one(27).Position
Good luck!
  3 Comments
Luuk van Oosten
Luuk van Oosten on 28 Jan 2015
Can you provide a .bam file for me and you code? I'll give it a try!
Razvan
Razvan on 31 Jan 2015
Edited: Razvan on 7 Feb 2015
Thanks for spending your time on this! Here is a BAM file and some Matlab code that illustrates my problem: https://app.box.com/s/e3sjphe802z616gzwzelse0o89py6vlm
If you run my script from that folder, you'll see the 2 ways that I know to open a bam file and to compute the fragment lengths. Using bamread, the TLEN field from the SAM/BAM file is easily accessible. Using BioMap objects, the TLEN/InsertSize information is not available anymore and one needs to match the headers of the reads in order to compute to total fragment length. The problem is that bamread is very slow in comparison with BioMap (3s vs. 0.2s), and sorting the headers of millions of reads is also pretty slow. It would be much better if BioMap would simply load the TLEN field of the SAM/BAM files. This is available information stored in the SAM/BAM file which is simply disregarded by BioMap (as far as I know). Any chance of getting this TLEN info using the faster BioMap loading function?

Sign in to comment.


Razvan
Razvan on 5 Feb 2015
Any other suggestions?

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!