The procrustes function analyzes the distribution of a set of shapes using Procrustes analysis. This analysis method matches landmark data (geometric locations representing significant features in a given shape) to calculate the best shape-preserving Euclidean transformations. These transformations minimize the differences in location between compared landmark data.
Procrustes analysis is also useful in conjunction with multidimensional scaling. In Example: Multidimensional Scaling there is an observation that the orientation of the reconstructed points is arbitrary. Two different applications of multidimensional scaling could produce reconstructed points that are very similar in principle, but that look different because they have different orientations. The procrustes function transforms one set of points to make them more comparable to the other.
The procrustes function takes two matrices as input:
The target shape matrix X has dimension n × p, where n is the number of landmarks in the shape and p is the number of measurements per landmark.
The comparison shape matrix Y has dimension n × q with q ≤ p. If there are fewer measurements per landmark for the comparison shape than the target shape (q < p), the function adds columns of zeros to Y, yielding an n × p matrix.
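The zero-padding described above can be sketched as follows. The 3-D target and 2-D comparison matrices are made-up illustrations, not data from this page. The procrustes function pads Y internally, so this step is shown only to make the dimensions concrete.

```matlab
% Hypothetical landmark data: 4 landmarks, target in 3-D, comparison in 2-D.
X = [0 0 0; 1 0 0; 1 1 0; 0 1 1];   % n-by-p, with p = 3
Y = [0 0; 2 0; 2 2; 0 2];           % n-by-q, with q = 2

[n, p] = size(X);
q = size(Y, 2);

% Append p-q columns of zeros so that Y has the same width as X.
Ypad = [Y, zeros(n, p - q)];

disp(size(Ypad))   % same size as X
```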
The equation to obtain the transformed shape, Z, is
$$Z=bYT+c$$ (Equation 13-1)
where:
b is a scaling factor that stretches (b > 1) or shrinks (b < 1) the points.
T is the orthogonal rotation and reflection matrix.
c is a matrix with constant values in each column, used to shift the points.
The procrustes function chooses b, T, and c to minimize the distance between the target shape X and the transformed shape Z as measured by the least squares criterion:
$$\sum_{i=1}^{n}\sum_{j=1}^{p}\left(X_{ij}-Z_{ij}\right)^{2}$$
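One standard way to minimize this criterion uses a singular value decomposition of the centered, normalized shapes. The sketch below follows that textbook approach in base MATLAB; it is an assumption about how such a solution can be computed, not a statement of how procrustes is implemented internally. The data are the landmark matrices from the handwritten threes example later on this page, and the variable names (X0, Y0, normX, and so on) are illustrative.

```matlab
% Landmark data from the handwritten threes example on this page.
X = [11 39;17 42;25 42;25 40;23 36;19 35;30 34;35 29;30 20;18 19];
B = [15 31;20 37;30 40;29 35;25 29;29 31;31 31;35 20;29 10;25 18];
Y = B + repmat([25 0], 10, 1);
n = size(X, 1);

% Center each shape on its column means.
muX = mean(X, 1);
muY = mean(Y, 1);
X0 = bsxfun(@minus, X, muX);
Y0 = bsxfun(@minus, Y, muY);

% Scale each centered shape to unit Frobenius norm.
normX = norm(X0, 'fro');
normY = norm(Y0, 'fro');
X0 = X0 / normX;
Y0 = Y0 / normY;

% The least squares orthogonal matrix comes from the SVD of X0'*Y0.
[U, S, V] = svd(X0' * Y0);
T = V * U';

% Scale, shift, and the transformed shape Z = b*Y*T + c.
b = trace(S) * normX / normY;
c = repmat(muX - b * muY * T, n, 1);
Z = b * Y * T + c;

% Residual sum of squares relative to the spread of the centered target.
d = sum(sum((X - Z).^2)) / normX^2;
```

Because T is built from U and V without any sign constraint, this sketch allows either a rotation or a reflection, whichever fits better.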
Procrustes analysis is appropriate when all p measurement dimensions have similar scales. The analysis would be inaccurate, for example, if the columns of Z had different scales:
The first column is measured in milliliters ranging from 2,000 to 6,000.
The second column is measured in degrees Celsius ranging from 10 to 25.
The third column is measured in kilograms ranging from 50 to 230.
In such cases, standardize your variables by:
Subtracting the sample mean from each variable.
Dividing each resultant variable by its sample standard deviation.
Use the zscore function to perform this standardization.
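The two standardization steps above can also be written out by hand in base MATLAB, which makes explicit what the standardization does. The measurement matrix M below is hypothetical, chosen only to mimic the three scales listed earlier; calling zscore(M) should give the same result.

```matlab
% Hypothetical measurements on very different scales
% (columns: milliliters, degrees Celsius, kilograms).
M = [2000 10  50;
     3500 18 120;
     4200 21 175;
     6000 25 230];

% Subtract each column's sample mean, then divide by its
% sample standard deviation (std normalizes by n-1 by default).
Mc = bsxfun(@minus, M, mean(M, 1));
Ms = bsxfun(@rdivide, Mc, std(M, 0, 1));

disp(mean(Ms, 1))     % column means are numerically zero
disp(std(Ms, 0, 1))   % column standard deviations are 1
```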
In this example, use Procrustes analysis to compare two handwritten number threes. Visually and analytically explore the effects of forcing size and reflection changes as follows:
Input landmark data for two handwritten number threes:
A = [11 39;17 42;25 42;25 40;23 36;19 35;30 34;35 29;...
     30 20;18 19];
B = [15 31;20 37;30 40;29 35;25 29;29 31;31 31;35 20;...
     29 10;25 18];
Create X and Y from A and B, moving B to the side to make each shape more visible:
X = A;
Y = B + repmat([25 0], 10,1);
Plot the shapes, using letters to designate the landmark points. Lines in the figure join the points to indicate the drawing path of each shape.
plot(X(:,1), X(:,2),'r-', Y(:,1), Y(:,2),'b-');
text(X(:,1), X(:,2),('abcdefghij')')
text(Y(:,1), Y(:,2),('abcdefghij')')
legend('X = Target','Y = Comparison','location','SE')
set(gca,'YLim',[0 55],'XLim',[0 65]);
Use Procrustes analysis to find the transformation that minimizes distances between landmark data points.
Call procrustes as follows:
[d, Z, tr] = procrustes(X,Y);
The outputs of the function are:
d – A standardized dissimilarity measure.
Z – A matrix of the transformed landmarks.
tr – A structure array of the computed transformation with fields T, b, and c which correspond to the transformation equation, Equation 13-1.
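As a quick check on how the fields of tr relate to Equation 13-1, you can rebuild the transformed shape yourself. This sketch assumes that tr.c is returned as an n-by-p matrix with constant columns, as described for c above; Zcheck and maxDiff are illustrative names.

```matlab
% Landmark data from this example.
A = [11 39;17 42;25 42;25 40;23 36;19 35;30 34;35 29;30 20;18 19];
B = [15 31;20 37;30 40;29 35;25 29;29 31;31 31;35 20;29 10;25 18];
X = A;
Y = B + repmat([25 0], 10, 1);

[d, Z, tr] = procrustes(X, Y);

% Rebuild the transformed shape from the fields of tr: Z = b*Y*T + c.
Zcheck = tr.b * Y * tr.T + tr.c;

% Largest absolute difference between the returned and rebuilt shapes.
maxDiff = max(abs(Z(:) - Zcheck(:)));
disp(maxDiff)
```

A value of maxDiff at the level of floating-point round-off indicates that the two versions of Z agree.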
Visualize the transformed shape, Z, using a dotted blue line:
plot(X(:,1), X(:,2),'r-', Y(:,1), Y(:,2),'b-',...
     Z(:,1),Z(:,2),'b:');
text(X(:,1), X(:,2),('abcdefghij')')
text(Y(:,1), Y(:,2),('abcdefghij')')
text(Z(:,1), Z(:,2),('abcdefghij')')
legend('X = Target','Y = Comparison',...
       'Z = Transformed','location','SW')
set(gca,'YLim',[0 55],'XLim',[0 65]);
Use two different numerical values to assess the similarity of the target shape and the transformed shape.
Dissimilarity Measure d. The dissimilarity measure d gives a number between 0 and 1 describing the difference between the target shape and the transformed shape. Values near 0 imply more similar shapes, while values near 1 imply dissimilarity. For this example:
d = 0.1502
The small value of d in this case shows that the two shapes are similar.
procrustes calculates d as a ratio: the sum of squared deviations between the target points X and the transformed points Z, divided by the sum of squared deviations of the target points from their column means:
numerator = sum(sum((X-Z).^2))
numerator = 166.5321
denominator = sum(sum(bsxfun(@minus,X,mean(X)).^2))
denominator = 1.1085e+003
ratio = numerator/denominator
ratio = 0.1502
Note: The resulting measure d is independent of the size of the shapes and takes into account only the similarity of the landmark data. Examine the Scaling Measure b shows how to examine the size similarity of the shapes.
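The size independence described in the note can be checked by enlarging the comparison shape and repeating the analysis. In this sketch the factor of 3 is arbitrary, and the names d1, d2, tr1, and tr2 are illustrative.

```matlab
% Landmark data from this example.
A = [11 39;17 42;25 42;25 40;23 36;19 35;30 34;35 29;30 20;18 19];
B = [15 31;20 37;30 40;29 35;25 29;29 31;31 31;35 20;29 10;25 18];
X = A;
Y = B + repmat([25 0], 10, 1);

% Original comparison shape.
[d1, Z1, tr1] = procrustes(X, Y);

% Comparison shape enlarged by an arbitrary factor of 3.
[d2, Z2, tr2] = procrustes(X, 3 * Y);

% d is unchanged; the scaling factor b absorbs the change in size.
disp([d1 d2])
disp([tr1.b, 3 * tr2.b])
```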
Examine the Scaling Measure b. The target and comparison threes in the previous figure appear to be of similar size. The closeness of the calculated scaling factor b to 1 supports this observation:
tr.b
ans = 0.9291
A value of b = 0.93 means that the best transformation shrinks the comparison shape by a factor of 0.93, a reduction of only 7%.
Explore the effects of manually adjusting the scaling and reflection coefficients.
Fix the Scaling Factor b = 1. Force b to equal 1 (set 'Scaling' to false) to examine the amount of dissimilarity in size of the target and transformed figures:
ds = procrustes(X,Y,'Scaling',false)
ds = 0.1552
In this case, setting 'Scaling' to false increases the calculated value of d by only 0.0049, which further supports the similarity in the size of the two number threes. A larger increase in d would have indicated a greater size discrepancy.
Force a Reflection in the Transformation. This example requires only a rotation, not a reflection, to align the shapes. You can show this by observing that the determinant of the matrix T is 1 in this analysis:
det(tr.T)
ans = 1.0000
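The determinant test works because an orthogonal matrix has determinant +1 when it is a pure rotation and -1 when it includes a reflection. The following base MATLAB sketch illustrates this with a hand-built rotation and reflection; the angle theta and the matrices R and F are arbitrary illustrations, not values from this analysis.

```matlab
theta = pi/6;   % arbitrary rotation angle for illustration

% Pure 2-D rotation matrix.
R = [cos(theta) -sin(theta);
     sin(theta)  cos(theta)];

% The same rotation combined with a flip of the second axis.
F = R * [1 0; 0 -1];

detR = det(R);   % rotation only
detF = det(F);   % rotation plus reflection
disp([detR detF])
```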
When the transformation includes a reflection, the determinant of T is -1. You can force a reflection into the transformation as follows:
[dr,Zr,trr] = procrustes(X,Y,'Reflection',true);
dr
dr = 0.8130
The d value increases dramatically, indicating that a forced reflection leads to a poor transformation of the landmark points. A plot of the transformed shape shows a similar result:
The landmark data points are now further away from their target counterparts.
The transformed three is now an undesirable mirror image of the target three.
plot(X(:,1), X(:,2),'r-', Y(:,1), Y(:,2),'b-',...
     Zr(:,1),Zr(:,2),'b:');
text(X(:,1), X(:,2),('abcdefghij')')
text(Y(:,1), Y(:,2),('abcdefghij')')
text(Zr(:,1), Zr(:,2),('abcdefghij')')
legend('X = Target','Y = Comparison',...
       'Z = Transformed','location','SW')
set(gca,'YLim',[0 55],'XLim',[0 65]);
It appears that the shapes might be better matched if you flipped the transformed shape upside down. Flipping the shapes would make the transformation even worse, however, because the landmark data points would be further away from their target counterparts. From this example, it is clear that manually adjusting the scaling and reflection parameters is generally not optimal.