Convert aligned sequences to corresponding Compact Idiosyncratic Gapped Alignment Report (CIGAR) format
align2cigar converts aligned sequences, represented as a cell array of aligned character vectors or a character array, into a cell array of corresponding CIGAR-formatted character vectors, using the reference sequence specified by a character vector. It also returns a vector of integers indicating the start position of each aligned sequence with respect to the ungapped reference sequence.
Cell array of character vectors or a character array representing aligned sequences. Soft clippings are assumed to be represented by lowercase letters in the aligned sequences. Skipped positions are assumed to be represented by periods (.) in the aligned sequences.
Character vector specifying an aligned reference sequence. The
Cell array of CIGAR-formatted character vectors corresponding
to each aligned sequence in the input alignment.
Vector of integers indicating the start position of each aligned sequence with respect to the ungapped reference sequence.
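The mapping from one aligned row to a CIGAR string and a start position can be sketched in plain MATLAB. This is an illustration only, not the toolbox implementation: the helper name rowToCigarSketch is hypothetical, and the handling of columns that are gaps in both the row and the reference (ignored here) and of the start position for soft-clipped rows are assumptions. The row is assumed to contain at least one non-space character.

```matlab
% rowToCigarSketch.m
% Hypothetical helper (not part of Bioinformatics Toolbox): derive a CIGAR
% string and a start position for one aligned row against an aligned reference.
function [cigar, start] = rowToCigarSketch(row, ref)
    cols = find(row ~= ' ');              % columns actually covered by the row
    ops  = repmat(' ', 1, numel(cols));   % one CIGAR operation per covered column
    for k = 1:numel(cols)
        s = row(cols(k));
        r = ref(cols(k));
        if s == '-' && r == '-'
            ops(k) = ' ';                 % gap in both: ignored (assumption)
        elseif s == '-'
            ops(k) = 'D';                 % deletion from the reference
        elseif s == '.'
            ops(k) = 'N';                 % skipped position
        elseif r == '-'
            ops(k) = 'I';                 % insertion relative to the reference
        elseif s == lower(s)
            ops(k) = 'S';                 % lowercase letter: soft clipping
        else
            ops(k) = 'M';                 % aligned (match or mismatch)
        end
    end
    ops(ops == ' ') = [];

    % Run-length encode the operations, e.g. 'MMMMDM' becomes '4M1D1M'.
    cigar = '';
    k = 1;
    while k <= numel(ops)
        n = find(ops(k:end) ~= ops(k), 1) - 1;
        if isempty(n)
            n = numel(ops) - k + 1;       % run extends to the end
        end
        cigar = [cigar sprintf('%d%c', n, ops(k))]; %#ok<AGROW>
        k = k + n;
    end

    % Start position: count ungapped reference characters up to and
    % including the first covered column.
    start = sum(ref(1:cols(1)) ~= '-');
end
```

For instance, [c, s] = rowToCigarSketch('  GTAT-C', 'ACGTATGC') yields c = '4M1D1M' and s = 3 under these assumptions.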
This example shows how to convert aligned sequences to CIGAR strings.
Create a character array of aligned sequences, create a character vector specifying a reference sequence, and then convert the alignment to CIGAR strings:
aln = ['ACG-ATGC'; 'ACGT-TGC'; '  GTAT-C']
aln =
ACG-ATGC
ACGT-TGC
  GTAT-C
ref = 'ACGTATGC';
[cigar, start] = align2cigar(aln, ref)
cigar =
  1×3 cell array
    '3M1D4M'    '4M1D3M'    '4M1D1M'
start =
     1     1     3
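Because the alignment can also be supplied as a cell array of aligned character vectors, the same call can be written without building a character array. The sketch below assumes Bioinformatics Toolbox is installed; the variable name alnCell is illustrative, and each cell entry is padded with leading spaces to the same length as the reference.

```matlab
% Same alignment as above, supplied as a cell array of character vectors.
alnCell = {'ACG-ATGC'; 'ACGT-TGC'; '  GTAT-C'};
ref = 'ACGTATGC';

% Convert to CIGAR strings and start positions.
[cigar, start] = align2cigar(alnCell, ref);

% Display each CIGAR string alongside its start position.
for k = 1:numel(cigar)
    fprintf('%s starts at reference position %d\n', cigar{k}, start(k));
end
```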
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., and Durbin, R. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 16, 2078–2079.