# profalign

Align two profiles using Needleman-Wunsch global alignment

## Syntax

`Prof = profalign(Prof1, Prof2)`

`[Prof, H1, H2] = profalign(Prof1, Prof2)`

`profalign(..., 'ScoringMatrix', ScoringMatrixValue, ...)`

`profalign(..., 'GapOpen', {G1Value, G2Value}, ...)`

`profalign(..., 'ExtendGap', {E1Value, E2Value}, ...)`

`profalign(..., 'ExistingGapAdjust', ExistingGapAdjustValue, ...)`

`profalign(..., 'TerminalGapAdjust', TerminalGapAdjustValue, ...)`

`profalign(..., 'ShowScore', ShowScoreValue, ...)`

## Description

`Prof = profalign(Prof1, Prof2)` returns a new profile (`Prof`) for the optimal global alignment of two profiles (`Prof1`, `Prof2`). The profiles (`Prof1`, `Prof2`) are numeric arrays of size `[(4 or 5 or 20 or 21) x Profile Length]` with counts or weighted profiles. Weighted profiles are used to down-weight similar sequences and up-weight divergent sequences. The output profile is a numeric matrix of size `[(5 or 21) x New Profile Length]` where the last row represents gaps. Original gaps in the input profiles are preserved. The output profile is the result of adding the aligned columns of the input profiles.

`[Prof, H1, H2] = profalign(Prof1, Prof2)` returns pointers that indicate how to rearrange the columns of the original profiles into the new profile.
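As a hedged illustration of the size rules and pointers described above, the following sketch builds two small nucleotide profiles and aligns them. The sequences and the variable names (`nt1`, `nt2`, `q1`, `q2`) are invented for this example, and it assumes that the Bioinformatics Toolbox function `seqprofile` with its `'Alphabet'` option is available.

```matlab
% Hypothetical nucleotide alignments (illustrative data only).
nt1 = ['ACGTACGT';'ACGTACGA'];
nt2 = ['ACGACGT';'ACGACGT'];

% Count profiles without a gap row: 4 x Profile Length.
q1 = seqprofile(nt1,'Alphabet','NT','Counts',true);
q2 = seqprofile(nt2,'Alphabet','NT','Counts',true);

% Align the profiles and request the column pointers.
[q, h1, h2] = profalign(q1,q2);

% The output profile includes a gap row, so it has 5 rows here.
disp(size(q))

% h1 and h2 hold one output-column index per input-profile column.
disp([numel(h1) size(q1,2)])
disp([numel(h2) size(q2,2)])
```

Each pointer vector has as many elements as its input profile has columns, which is what allows the original alignments to be placed into the columns of the merged alignment.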

`profalign(..., 'PropertyName', PropertyValue, ...)` calls `profalign` with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each `PropertyName` must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:

`profalign(..., 'ScoringMatrix', ScoringMatrixValue, ...)` defines the scoring matrix to be used for the alignment. `ScoringMatrixValue` can be either of the following:

- Character vector or string specifying the scoring matrix to use for the alignment. Choices for amino acid sequences are:

  - `'BLOSUM62'`
  - `'BLOSUM30'` increasing by `5` up to `'BLOSUM90'`
  - `'BLOSUM100'`
  - `'PAM10'` increasing by `10` up to `'PAM500'`
  - `'DAYHOFF'`
  - `'GONNET'`

  Default is:

  - `'BLOSUM50'`, when `AlphabetValue` equals `'AA'`
  - `'NUC44'`, when `AlphabetValue` equals `'NT'`

  **Note** The above scoring matrices, provided with the software, also include a structure containing a scale factor that converts the units of the output score to bits. You can also use the `'Scale'` property to specify an additional scale factor to convert the output score from bits to another unit.

- Matrix representing the scoring matrix to use for the alignment, such as returned by the `blosum`, `pam`, `dayhoff`, `gonnet`, or `nuc44` function.

  **Note** If you use a scoring matrix that you created or that was created by one of the above functions, the matrix does not include a scale factor. The output score will be returned in the same units as the scoring matrix.

**Note** If you need to compile `profalign` into a stand-alone application or software component using MATLAB® Compiler™, use a matrix instead of a character vector or string for `ScoringMatrixValue`.
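The two forms of `ScoringMatrixValue` can be compared with a minimal sketch such as the one below. It reuses the sequences from the Examples section and assumes that the matrix returned by `blosum(62)` can be passed directly, as the description above suggests. The variable names `sm`, `pByName`, and `pByMatrix` are illustrative only.

```matlab
% Profiles built from the alignments used in the Examples section.
ma1 = ['RGTANCDMQDA';'RGTAHCDMQDA';'RRRAPCDL-DA'];
ma2 = ['RGTHCDLADAT';'RGTACDMADAA'];
p1 = seqprofile(ma1,'gaps','all','counts',true);
p2 = seqprofile(ma2,'counts',true);

% Form 1: select a scoring matrix by name.
pByName = profalign(p1,p2,'ScoringMatrix','BLOSUM62');

% Form 2: pass a numeric matrix, the form recommended above
% when profalign is compiled into a stand-alone application.
sm = blosum(62);
pByMatrix = profalign(p1,p2,'ScoringMatrix',sm);

% Both calls return a profile with 21 rows (20 amino acids plus gaps).
disp(size(pByName))
disp(size(pByMatrix))
```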

`profalign(..., 'GapOpen', {G1Value, G2Value}, ...)` sets the penalties for opening a gap in the first and second profiles, respectively. `G1Value` and `G2Value` can be either scalars or vectors. When using a vector, the number of elements is one more than the length of the input profile. Every element indicates the position-specific penalty for opening a gap between two consecutive symbols in the sequence. The first and the last elements are the gap penalties used at the ends of the sequence. The default gap open penalties are `{10,10}`.

`profalign(..., 'ExtendGap', {E1Value, E2Value}, ...)` sets the penalties for extending a gap in the first and second profiles, respectively. `E1Value` and `E2Value` can be either scalars or vectors. When using a vector, the number of elements is one more than the length of the input profile. Every element indicates the position-specific penalty for extending a gap between two consecutive symbols in the sequence. The first and the last elements are the gap penalties used at the ends of the sequence. If `ExtendGap` is not specified, then extensions to gaps are scored with the same value as `GapOpen`.

`profalign(..., 'ExistingGapAdjust', ExistingGapAdjustValue, ...)`, if `ExistingGapAdjustValue` is `false`, turns off the automatic adjustment based on existing gaps of the position-specific penalties for opening a gap. When `ExistingGapAdjustValue` is `true` (default), for every profile position, `profalign` proportionally lowers the penalty for opening a gap toward the penalty of extending a gap based on the proportion of gaps found in the contiguous symbols and on the weight of the input profile.

`profalign(..., 'TerminalGapAdjust', TerminalGapAdjustValue, ...)`, when `TerminalGapAdjustValue` is `true`, adjusts the penalty for opening a gap at the ends of the sequence to be equal to the penalty for extending a gap. Default is `false`.

`profalign(..., 'ShowScore', ShowScoreValue, ...)`, when `ShowScoreValue` is `true`, displays the scoring space and the winning path.

## Examples

Read in sequences and create profiles.

    ma1 = ['RGTANCDMQDA';'RGTAHCDMQDA';'RRRAPCDL-DA'];
    ma2 = ['RGTHCDLADAT';'RGTACDMADAA'];
    p1 = seqprofile(ma1,'gaps','all','counts',true);
    p2 = seqprofile(ma2,'counts',true);

Merge two profiles into a single one by aligning them.

    p = profalign(p1,p2);
    seqlogo(p)

Use the output pointers to generate the multiple alignment.

    [p, h1, h2] = profalign(p1,p2);
    ma = repmat('-',5,12);
    ma(1:3,h1) = ma1;
    ma(4:5,h2) = ma2;
    disp(ma)

Increase the gap penalty before cysteine in the second profile.

    gapVec = 10 + [p2(aa2int('C'),:) 0] * 10
    p3 = profalign(p1,p2,'gapopen',{10,gapVec});
    seqlogo(p3)
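The vector form of `GapOpen` needs one more element than the profile has columns, with the first and last elements applying to the ends of the sequence. The sketch below shows one way to build such a vector. The penalty values and the names `L2`, `openVec`, and `pEnds` are hypothetical choices for illustration, not recommended settings.

```matlab
% Profiles built from the alignments used in this section.
ma1 = ['RGTANCDMQDA';'RGTAHCDMQDA';'RRRAPCDL-DA'];
ma2 = ['RGTHCDLADAT';'RGTACDMADAA'];
p1 = seqprofile(ma1,'gaps','all','counts',true);
p2 = seqprofile(ma2,'counts',true);

L2 = size(p2,2);                 % length of the second profile
openVec = 10 * ones(1, L2 + 1);  % one penalty per gap position, ends included
openVec([1 end]) = 2;            % illustrative: cheaper gaps at both ends

% Scalar penalty for the first profile, vector penalty for the second.
pEnds = profalign(p1,p2,'GapOpen',{10,openVec});
disp(size(pEnds))
```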

Add a new sequence to a profile without inserting new gaps into the profile.

    gapVec = [0 inf(1,11) 0];
    p4 = profalign(p3,seqprofile('PLHFMSVLWDVQQWP'),...
                   'gapopen',{gapVec,10});
    seqlogo(p4)
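The remaining properties can be combined in a single call. The following is a minimal sketch with illustrative values only: the penalties `{1,1}` for `ExtendGap` and the variable name `pOpts` are assumptions made for this example, not values taken from this page.

```matlab
% Profiles built from the alignments used in this section.
ma1 = ['RGTANCDMQDA';'RGTAHCDMQDA';'RRRAPCDL-DA'];
ma2 = ['RGTHCDLADAT';'RGTACDMADAA'];
p1 = seqprofile(ma1,'gaps','all','counts',true);
p2 = seqprofile(ma2,'counts',true);

% Separate open and extend penalties, terminal gaps scored as
% extensions, and no automatic adjustment for existing gaps.
pOpts = profalign(p1,p2, ...
    'GapOpen',{10,10}, ...
    'ExtendGap',{1,1}, ...
    'TerminalGapAdjust',true, ...
    'ExistingGapAdjust',false);
disp(size(pOpts))

% Adding 'ShowScore',true to the call would also display the
% scoring space and the winning path.
```

Because property names are case insensitive and can be given in any order, the same call can be written with `'gapopen'` and `'extendgap'`, as in the examples above.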

## Version History

**Introduced before R2006a**