Classical multidimensional scaling


Y = cmdscale(D)
[Y,e] = cmdscale(D)


Y = cmdscale(D) takes an n-by-n distance matrix D, and returns an n-by-p configuration matrix Y. Rows of Y are the coordinates of n points in p-dimensional space for some p < n. When D is a Euclidean distance matrix, the distances between those points are given by D. p is the dimension of the smallest space in which the n points whose inter-point distances are given by D can be embedded.

[Y,e] = cmdscale(D) also returns the eigenvalues of Y*Y' in e. When D is Euclidean, the first p elements of e are positive and the rest are zero. If the first k elements of e are much larger than the remaining (n-k), then you can use the first k columns of Y as k-dimensional points whose inter-point distances approximate D. This can provide a useful dimension reduction for visualization, for example with k = 2.
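
The relationship between D, Y, and e can be illustrated by writing out the textbook classical scaling computation by hand. The following is only a sketch under stated assumptions: it follows the standard double-centering derivation and is not taken from the cmdscale source, the sample points are made up for illustration, and the relative tolerance used to count positive eigenvalues is an assumption chosen here.

```matlab
% Textbook classical MDS written out step by step (illustrative sketch only).
X = [0 0; 3 0; 0 4; 3 4];          % four made-up points in the plane
D = squareform(pdist(X));          % full n-by-n Euclidean distance matrix
n = size(D,1);

J = eye(n) - ones(n)/n;            % centering matrix
B = -0.5 * J * (D.^2) * J;         % double-centered squared distances (Gram matrix)
B = (B + B')/2;                    % remove any round-off asymmetry

[V,L] = eig(B);
[lambda,idx] = sort(diag(L),'descend');
V = V(:,idx);

% Count eigenvalues that are clearly positive (tolerance is an assumption).
p = sum(lambda > max(abs(lambda)) * eps^(3/4));

% Scale the leading eigenvectors to obtain an n-by-p configuration.
Y = V(:,1:p) * diag(sqrt(lambda(1:p)));

% The configuration reproduces the original distances up to round-off.
recon_err = max(abs(pdist(X) - pdist(Y)));

% The distances should also agree with those of the configuration from cmdscale.
Yc = cmdscale(D);
cmp_err = max(abs(pdist(Yc) - pdist(Y)));
```

Because the sample points lie in a plane, only two eigenvalues are clearly positive, so Y has two columns. The configuration is determined only up to rotation, reflection, and translation, which is why the comparison uses inter-point distances rather than the coordinates themselves.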

D need not be a Euclidean distance matrix. If it is non-Euclidean or a more general dissimilarity matrix, then some elements of e are negative, and cmdscale chooses p as the number of positive eigenvalues. In this case, the reduction to p or fewer dimensions provides a reasonable approximation to D only if the negative elements of e are small in magnitude.
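
As a small illustration of the non-Euclidean case, the sketch below uses a made-up dissimilarity matrix in which point 1 is at distance 1 from three other points that are pairwise at distance 2. No Euclidean configuration has these distances, because the point nearest to all three vertices of an equilateral triangle with side 2 is still 2/sqrt(3) away from each of them. The variable neg_share is a hypothetical diagnostic introduced here, not an output of cmdscale.

```matlab
% Star-shaped dissimilarities: point 1 is 1 away from points 2-4,
% which are pairwise 2 apart. This matrix is not Euclidean.
D = [0 1 1 1
     1 0 2 2
     1 2 0 2
     1 2 2 0];

[Y,e] = cmdscale(D);

p = size(Y,2);                  % number of eigenvalues cmdscale treated as positive
most_negative = min(e);         % negative for this D

% Hypothetical diagnostic: share of the total eigenvalue magnitude that is negative.
neg_share = sum(abs(e(e < 0))) / sum(abs(e));

% Largest discrepancy between D and the distances in the returned configuration.
maxerr = max(abs(squareform(D) - pdist(Y)));
```

A small neg_share suggests that the configuration in Y is a reasonable approximation to D, while a large one suggests that classical scaling is a poor fit for the data.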

You can specify D as either a full dissimilarity matrix, or in upper triangle vector form such as is output by pdist. A full dissimilarity matrix must be real and symmetric, and have zeros along the diagonal and positive elements everywhere else. A dissimilarity matrix in upper triangle form must have real, positive entries. You can also specify D as a full similarity matrix, with ones along the diagonal and all other elements less than one. cmdscale transforms a similarity matrix to a dissimilarity matrix in such a way that distances between the points returned in Y equal or approximate sqrt(1-D). To use a different transformation, you must transform the similarities prior to calling cmdscale.
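
The accepted input forms can be compared with a short sketch. The sample points and the Gaussian similarity used here are assumptions made for illustration; any symmetric matrix with ones along the diagonal and smaller values elsewhere is treated as a similarity matrix in the same way.

```matlab
X = [0 0; 1 0; 0 1; 1 1];        % made-up points for illustration

% Upper triangle vector form (as output by pdist) and full matrix form.
Dvec  = pdist(X);
Dfull = squareform(Dvec);
Y1 = cmdscale(Dvec);
Y2 = cmdscale(Dfull);
same = max(abs(pdist(Y1) - pdist(Y2)));    % both forms give the same distances

% A similarity matrix: ones along the diagonal, all other elements less than one.
S  = exp(-Dfull.^2);
Ys = cmdscale(S);

% Distances in Ys equal or approximate sqrt(1-S).
target = squareform(sqrt(1 - S));
simerr = max(abs(pdist(Ys) - target));

% To use a different transformation, apply it before calling cmdscale.
Yalt = cmdscale(1 - S);
```

Here sqrt(1-S) happens to be a Euclidean distance matrix, so the distances in Ys match it up to round-off. The alternative transformation 1-S is passed as an ordinary dissimilarity matrix, since it has zeros along the diagonal and positive elements elsewhere.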


Generate some points in 4-D space, but close to 3-D space, then reduce them to distances only.

X = [normrnd(0,1,10,3) normrnd(0,.1,10,1)];
D = pdist(X,'euclidean');

Find a configuration with those inter-point distances.

[Y,e] = cmdscale(D);

% Number of positive eigenvalues: four, but the fourth is small
dim = sum(e > eps^(3/4))

% Two dimensions: poor reconstruction
maxerr2 = max(abs(pdist(X)-pdist(Y(:,1:2))))

% Three dimensions: good reconstruction
maxerr3 = max(abs(pdist(X)-pdist(Y(:,1:3))))

% All four dimensions: exact reconstruction (up to round-off)
maxerr4 = max(abs(pdist(X)-pdist(Y)))

% D is now non-Euclidean
D = pdist(X,'cityblock');
[Y,e] = cmdscale(D);

% One eigenvalue is large and negative
min(e)

% Poor reconstruction of the city block distances in D
maxerr = max(abs(D-pdist(Y)))


[1] Seber, G. A. F. Multivariate Observations. Hoboken, NJ: John Wiley & Sons, Inc., 1984.


Introduced before R2006a
