
Return information about BAM file


InfoStruct = baminfo(File)
InfoStruct = baminfo(File,Name,Value)


InfoStruct = baminfo(File) returns a MATLAB® structure containing summary information about a BAM-formatted file.

InfoStruct = baminfo(File,Name,Value) returns a MATLAB structure with additional options specified by one or more Name,Value pair arguments.

Input Arguments


File

Character vector specifying either a file name, or a path and file name, of a BAM-formatted file. If you specify only a file name, that file must be on the MATLAB search path or in the Current Folder.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.


'ScanDictionary'

Logical that controls the scanning of the BAM-formatted file to determine the reference names and the number of reads aligned to each reference. If true, the ScannedDictionary and ScannedDictionaryCount fields contain this information.

Default: false


'NumOfReads'

Logical that controls the scanning of a BAM-formatted file to determine the number of alignment records in the file. If true, the NumReads field contains this information.

Default: false
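
As a minimal sketch of how the two options affect the output, the following compares a header-only call with a full scan. It assumes the ex1.bam sample file shipped with the toolbox is on the MATLAB search path; the variable names quick and full are illustrative only.

```matlab
% Header-only call: neither option is set, so the alignment
% records are not scanned.
quick = baminfo('ex1.bam');

% Full scan: collect reference names, per-reference read counts,
% and the total number of alignment records.
full = baminfo('ex1.bam','ScanDictionary',true,'NumOfReads',true);

% The scanned fields stay empty unless ScanDictionary is true.
isempty(quick.ScannedDictionary)

% Total number of alignment records found by the scan.
full.NumReads
```

Because both options require reading through the file, it is usually reasonable to request them only when the counts are actually needed.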

Output Arguments


InfoStruct

MATLAB structure containing summary information about a BAM-formatted file. The structure contains these fields.

Filename: Name of the BAM-formatted file.
FilePath: Path to the file.
FileSize: Size of the file in bytes.
FileModDate: Modification date of the file.
Header**: Structure containing the file format version, sort order, and group order.

ReadGroup**: Structure containing the:

  • Read group identifier

  • Sample

  • Library

  • Description

  • Platform unit

  • Predicted median insert size

  • Sequencing center

  • Date

  • Platform


SequenceDictionary**: Structure containing the:

  • Sequence name

  • Sequence length

  • Genome assembly identifier

  • MD5 checksum of sequence

  • URI of sequence

  • Species


Program**: Structure containing the:

  • Program name

  • Version

  • Command line

NumReads: Number of alignment records (reads) in the BAM-formatted file.
ScannedDictionary*: Cell array of character vectors specifying the names of the reference sequences in the BAM-formatted file.
ScannedDictionaryCount*: Array specifying the number of reads aligned to each reference sequence.

* — The ScannedDictionary and ScannedDictionaryCount fields are empty if you do not set the ScanDictionary name-value pair argument to true.

** — These structures and their fields appear in the output structure only if they are in the BAM file. The information in these structures depends on the information in the BAM file.
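
Because the header structures are optional, code that consumes the output may want to test for them first. The following is a minimal sketch, assuming the ex1.bam sample file is on the MATLAB search path; the field name Program is an assumption for the program-information structure, and the variable names are illustrative.

```matlab
info = baminfo('ex1.bam');

% Header structures appear only if the BAM file defines them,
% so test for each field before using it.
% 'Program' is an assumed field name for the program information.
optionalFields = {'Header','ReadGroup','SequenceDictionary','Program'};
present = isfield(info, optionalFields);

% Report which optional structures this file provides.
for k = 1:numel(optionalFields)
    fprintf('%-20s %s\n', optionalFields{k}, mat2str(present(k)));
end
```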



This example shows how to retrieve information about the ex1.bam file included with the Bioinformatics Toolbox™.

info = baminfo('ex1.bam','ScanDictionary',true,'NumOfReads',true)
info = 

  struct with fields:

                  Filename: 'ex1.bam'
                  FilePath: '/mathworks/devel/bat/Bdoc17b/build/matlab/toolbox/bioinfo/bioinfodata'
                  FileSize: 126692
               FileModDate: '07-May-2010 16:12:04'
                    Header: [1x1 struct]
                 ReadGroup: [1x2 struct]
        SequenceDictionary: [1x2 struct]
                  NumReads: 3307
         ScannedDictionary: {2x1 cell}
    ScannedDictionaryCount: [2x1 uint64]

List the number of references found in the BAM file.

numel(info.ScannedDictionary)
ans =

     2

Alternatively, you can use the header information in a BAM file to find the number of references, which avoids scanning the entire file.

info = baminfo('ex1.bam');
NRefs = numel(info.SequenceDictionary)
NRefs =

     2

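The scanned fields can also be paired to list how many reads align to each reference. This is a small sketch rather than part of the original example; it assumes that ScannedDictionaryCount is a numeric array with one entry per name in ScannedDictionary, as the displayed output above suggests.

```matlab
info = baminfo('ex1.bam','ScanDictionary',true);

% Print each scanned reference name next to its aligned-read count.
for k = 1:numel(info.ScannedDictionary)
    fprintf('%s\t%d\n', info.ScannedDictionary{k}, ...
        info.ScannedDictionaryCount(k));
end
```
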

Use baminfo to investigate the size and content of a BAM-formatted file, including reference sequence names, before using the bamread function to read the file contents into a MATLAB structure.
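
One way to apply this tip is sketched below: read the reference name and length from the header, then pass them to bamread to load a limited range. The field names SequenceName and SequenceLength are assumptions about the SequenceDictionary structure, and the 500-position window is an arbitrary illustrative choice.

```matlab
info = baminfo('ex1.bam');

% Pick the first reference listed in the header.
% SequenceName and SequenceLength are assumed field names.
ref = info.SequenceDictionary(1);

% Limit the read to the first 500 positions, clamped to the
% reference length, instead of loading the whole file.
range = [1 min(500, double(ref.SequenceLength))];
reads = bamread('ex1.bam', ref.SequenceName, range);

% Number of alignment records returned for that range.
numel(reads)
```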


[1] Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 16, 2078–2079.

Introduced in R2010b
