The `procrustes` function analyzes the distribution of a set of shapes
using Procrustes analysis. This analysis method matches landmark data
(geometric locations representing significant features in a given shape)
to calculate the best shape-preserving Euclidean transformations. These
transformations minimize the differences in location between compared
landmark data.

Procrustes analysis is also useful in conjunction with multidimensional
scaling. The topic Example: Multidimensional Scaling observes that the
orientation of the reconstructed points is arbitrary. Two different
applications of multidimensional scaling could produce reconstructed
points that are very similar in principle, but that look different
because they have different orientations. The `procrustes` function
transforms one set of points to make them more comparable to the other.
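
The orientation issue can be illustrated with a minimal sketch. The data here is hypothetical: `Y` is built by rotating a random configuration `X` through an arbitrary angle, standing in for two reconstructions that differ only in orientation. It assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Hypothetical data: the same configuration in two orientations.
rng(0);                                   % repeatable random numbers
X = randn(10, 2);                         % reference configuration
theta = pi/3;                             % arbitrary rotation angle
R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
Y = X * R;                                % rotated copy of X

% Align Y to X. Z is the transformed version of Y.
[d, Z] = procrustes(X, Y);

% Largest coordinate difference after alignment; expected to be near zero
% because Y differs from X only by a rotation.
maxErr = max(abs(Z(:) - X(:)));
```

Because the two configurations differ only by a rotation, the aligned points `Z` should coincide with `X` up to floating-point error.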

The `procrustes` function takes two matrices as input:

The target shape matrix *X* has dimension `n` × `p`, where `n` is the number of landmarks in the shape and `p` is the number of measurements per landmark.

The comparison shape matrix *Y* has dimension `n` × `q` with `q` ≤ `p`. If there are fewer measurements per landmark for the comparison shape than the target shape (`q` < `p`), the function adds columns of zeros to *Y*, yielding an `n` × `p` matrix.
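
As a rough sketch of the zero-padding behavior, the following compares a call with a two-column comparison matrix against a call where the column of zeros is appended by hand. The matrices are random, hypothetical data chosen only to show the dimensions.

```matlab
rng(1);                        % repeatable random numbers
X = randn(5, 3);               % target: n = 5 landmarks, p = 3 measurements
Y = randn(5, 2);               % comparison: n = 5 landmarks, q = 2 measurements

% Let procrustes pad Y with a column of zeros internally.
[d, Z] = procrustes(X, Y);

% Pad Y by hand for comparison.
Ypad = [Y zeros(5, 1)];
[dPad, Zpad] = procrustes(X, Ypad);

sizeZ = size(Z);               % Z has the dimensions of the target, 5-by-3
```

Both calls should give the same dissimilarity and the same transformed shape, since the padded matrix is what the function works with in either case.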

The equation to obtain the transformed shape, *Z*, is

$$Z = bYT + c \tag{15-1}$$

where:

*b* is a scaling factor that stretches (*b* > 1) or shrinks (*b* < 1) the points.

*T* is the orthogonal rotation and reflection matrix.

*c* is a matrix with constant values in each column, used to shift the points.
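
The three components can be inspected through the third output of `procrustes`, a structure with fields `b`, `T`, and `c`. The sketch below uses hypothetical data (a scaled, rotated, shifted, and slightly noisy copy of `X`) and rebuilds the transformed shape from those fields.

```matlab
rng(2);                                    % repeatable random numbers
X = randn(8, 2);                           % target shape

% Hypothetical comparison shape: scaled, rotated, and shifted copy of X
% with a little noise added.
theta = pi/6;
R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
Y = 0.5 * X * R + 2 + 0.05 * randn(8, 2);

[d, Z, transform] = procrustes(X, Y);

% Rebuild Z from the returned components: Z = b*Y*T + c.
Zcheck = transform.b * Y * transform.T + transform.c;
maxDiff = max(abs(Z(:) - Zcheck(:)));      % expected to be near zero

% T is orthogonal, so T*T' should be close to the identity.
orthErr = norm(transform.T * transform.T' - eye(2));
```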

The `procrustes` function chooses *b*, *T*, and *c* to minimize the
distance between the target shape *X* and the transformed shape *Z* as
measured by the least squares criterion:

$$\sum_{i=1}^{n} \sum_{j=1}^{p} (X_{ij} - Z_{ij})^{2}$$
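
The criterion can be evaluated directly from the outputs. The sketch below, on hypothetical random data, computes the sum of squared differences and compares it with the first output `d`, which, with the default options, is understood to be that sum standardized by the sum of squared deviations of the centered target.

```matlab
rng(3);                                  % repeatable random numbers
X = randn(12, 2);                        % target shape
Y = randn(12, 2);                        % comparison shape

[d, Z] = procrustes(X, Y);

% Least squares criterion: sum over all landmarks and measurements.
sse = sum(sum((X - Z).^2));

% Standardize by the total squared deviation of X about its column means.
Xc = bsxfun(@minus, X, mean(X, 1));
dCheck = sse / sum(sum(Xc.^2));          % expected to match d
```

A value of `d` near 0 indicates a close fit after transformation, and a value near 1 indicates a poor one.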

Procrustes analysis is appropriate when all `p` measurement dimensions
have similar scales. The analysis would be inaccurate, for example, if
the columns of *Z* had different scales:

The first column is measured in milliliters ranging from 2,000 to 6,000.

The second column is measured in degrees Celsius ranging from 10 to 25.

The third column is measured in kilograms ranging from 50 to 230.

In such cases, standardize your variables by:

Subtracting the sample mean from each variable.

Dividing each resultant variable by its sample standard deviation.

Use the `zscore` function to perform this standardization.
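
A minimal sketch of this workflow follows. The data is hypothetical, generated to mimic the three column ranges listed above (milliliters, degrees Celsius, kilograms); each matrix is standardized column by column before the shapes are compared.

```matlab
rng(4);                                  % repeatable random numbers
n = 20;                                  % number of landmarks (illustrative)

% Hypothetical measurements on very different scales:
% column 1 in mL (2,000 to 6,000), column 2 in deg C (10 to 25),
% column 3 in kg (50 to 230).
X = [2000 + 4000*rand(n, 1), 10 + 15*rand(n, 1), 50 + 180*rand(n, 1)];
Y = [2000 + 4000*rand(n, 1), 10 + 15*rand(n, 1), 50 + 180*rand(n, 1)];

% Standardize each column: subtract its mean, divide by its standard deviation.
Xs = zscore(X);
Ys = zscore(Y);

% Compare the standardized shapes.
[d, Z] = procrustes(Xs, Ys);

colMeans = mean(Xs, 1);                  % each close to 0
colStds  = std(Xs, 0, 1);                % each close to 1
```

Without the standardization step, the milliliter column would dominate the least squares criterion simply because its values are numerically largest.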
