Calculating the Mahalanobis distance when the number of observations is less than the dimension
I work in neuroscience, and my data has fewer trials (observations) than neurons (dimensions). When I use the mahal function on my data, I get the following error:
Error using mahal (line 38) The number of rows of X must exceed the number of columns.
Instead of posting my data, which is huge, you can run the following code, which produces the same error.
A = rand(1,100); % new data
B = rand(10,100); % 10 observations and dimension 100
d = mahal(A,B);
John D'Errico on 31 Mar 2017
Edited: John D'Errico on 31 Mar 2017
The problem is that the Mahalanobis distance is not defined in your case.
You can't compute a meaningful distance when the result would be undefined. Why do I say this? A Mahalanobis distance requires a covariance matrix. A NON-singular covariance matrix. If your matrix is singular, then the computation will produce garbage, since you cannot invert a singular matrix. With n observations, the sample covariance matrix has rank at most n-1, so with 10 observations in 100 dimensions it has rank at most 9 and is necessarily singular. Since you don't have sufficient data to estimate a complete covariance matrix, mahal must fail.
Think about it in terms of what a Mahalanobis distance means, and what a singular covariance matrix tells you. A singular covariance matrix tells you that you have NO information in some spatial directions about the system under study. So when you try to invert that, you get infinities, essentially infinite uncertainty.
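To see the singularity directly, here is a minimal sketch using the same shapes as the example in the question. The fixed seed and the variable names S and r are my own choices for illustration, not part of the original code.

```matlab
rng(0);                   % fixed seed, only so the run is reproducible
B = rand(10,100);         % 10 observations, 100 dimensions (as in the question)

S = cov(B);               % 100-by-100 sample covariance matrix
r = rank(S);              % numerical rank; cannot exceed size(B,1) - 1 = 9

fprintf('cov(B) is %d-by-%d with numerical rank %d\n', size(S,1), size(S,2), r);
```

The covariance is 100-by-100 but spans at most 9 directions, so it has no inverse and the quadratic form that mahal needs is undefined in the remaining directions.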
More Answers (2)
Ilya on 30 Aug 2017
For classification, use a regularized discriminant or a pseudo discriminant. Both options are supported in fitcdiscr. Regularization adds a positive value to the diagonal of the covariance matrix to make it full-rank. The pseudo discriminant amounts to taking pinv, the pseudoinverse of the covariance matrix.
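As a rough sketch of both options, assuming a made-up two-class data set (X and Y below are synthetic, and the values Gamma = 0.5 and lambda = 1e-3 are arbitrary picks that you would normally tune, for example by cross-validation). The first part only illustrates the two ideas on the covariance matrix itself, using a generic ridge-style term; it is not necessarily what fitcdiscr does internally.

```matlab
rng(0);                                   % fixed seed for reproducibility

% Synthetic data: 20 trials, 100 neurons, two classes (hypothetical)
X = [randn(10,100); randn(10,100) + 1];
Y = [ones(10,1); 2*ones(10,1)];

% --- The two ideas applied to a singular covariance matrix ---
S      = cov(X(Y == 1, :));               % 100-by-100, rank at most 9
lambda = 1e-3;                            % arbitrary positive value (assumption)
Sreg   = S + lambda * eye(size(S,1));     % positive diagonal term -> full rank
Spinv  = pinv(S);                         % pseudoinverse, defined even when S is singular

% --- The same options through fitcdiscr ---
% Regularized linear discriminant; Gamma must lie in [0,1]
mdlReg  = fitcdiscr(X, Y, 'DiscrimType', 'linear', 'Gamma', 0.5);

% Pseudo-linear discriminant (uses the pseudoinverse of the covariance)
mdlPinv = fitcdiscr(X, Y, 'DiscrimType', 'pseudoLinear');

% Classify one trial with each model
labelReg  = predict(mdlReg,  X(1,:));
labelPinv = predict(mdlPinv, X(1,:));
```

Either way the result depends on the regularization you choose, so treat it as a classifier built under that assumption rather than as a recovered "true" Mahalanobis distance.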