Thread Subject: Mahalanobis Distance Calculation

Subject: Mahalanobis Distance Calculation

From: Baris Yenidunya

Date: 29 Aug, 2008 16:50:18

Message: 1 of 10

Hi,

I'm trying to understand how the mahal function in MATLAB works.
I see that the method it uses is different from the Mahalanobis
distance (MD) calculations in the literature. How is QR decomposition
used here? Any help (references, links, etc.) will be
appreciated.

Thanks,

Baris Yenidunya

Subject: Mahalanobis Distance Calculation

From: Peter Perkins

Date: 2 Sep, 2008 17:49:45

Message: 2 of 10

Baris Yenidunya wrote:
> Hi,
>
> I'm trying to understand how the mahal function in MATLAB works.
> I see that the method it uses is different from the Mahalanobis
> distance (MD) calculations in the literature. How is QR decomposition
> used here? Any help (references, links, etc.) will be
> appreciated.

Baris, presumably by "in the literature", you mean formulas like the one that appears in the help for MAHAL. That's fine for understanding what the Mahalanobis distance is, but not so good for computational purposes. MAHAL rewrites that explicit inversion of the cov matrix as a solution of a linear system, by noticing that the (squared) Mahalanobis distance can be expressed as

D = Y0*S^(-1)*Y0'
  = Y0*(X0'*X0)^(-1)*Y0'
  = Y0*((Q*R)'*(Q*R))^(-1)*Y0'
  = Y0*(R'*R)^(-1)*Y0'
  = Y0*R^(-1) * (Y0*R^(-1))'

and then noticing that you can compute Y0*R^(-1) with MATLAB's backslash operator, which solves a triangular system instead of forming an inverse.

Hope this helps.
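
A minimal sketch of the chain of equalities above, checked numerically. The random data and every variable name here are illustrative assumptions, not taken from MAHAL's source. X0 and Y0 are taken to be X and Y with the column means of X subtracted, X0 is assumed to have full column rank, and only the unscaled identity is checked, exactly as it is written above.

```matlab
% Illustrative data: 50 reference points and 5 query points in 3 dimensions.
rng(0);
X = randn(50, 3);
Y = randn(5, 3);

% Center both sets on the column means of X.
m  = mean(X, 1);
X0 = X - repmat(m, size(X, 1), 1);
Y0 = Y - repmat(m, size(Y, 1), 1);

% Explicit inversion: the diagonal holds one value per row of Y0.
dInv = diag(Y0 * inv(X0' * X0) * Y0');

% Economy-size QR gives X0'*X0 = R'*R, so solve with R' instead of inverting.
[~, R] = qr(X0, 0);
ri  = R' \ Y0';            % equals (Y0*R^(-1))', one column per query point
dQR = sum(ri .* ri, 1)';   % squared norm of each column

% Largest disagreement between the two routes (rounding error only).
maxDiff = max(abs(dInv - dQR));
```

The two routes agree to rounding error on well-conditioned data. The backslash route avoids forming X0'*X0, which squares the condition number of X0.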

Subject: Mahalanobis Distance Calculation

From: Baris Yenidunya

Date: 3 Sep, 2008 18:35:04

Message: 3 of 10

Well, it really does. Thanks a lot, Peter.

Baris Yenidunya

Subject: Mahalanobis Distance Calculation

From: Dilber Ayhan

Date: 14 Sep, 2008 15:15:04

Message: 4 of 10

Mr. Perkins,

After your explanation to my friend Barış, I wonder why the result is multiplied by (rx-1) in MATLAB's mahal function. I could not find where that factor comes from.

The lines in question are:

ri = R'\(Y-M)';
d = sum(ri.*ri,1)'*(rx-1);

I think sum(ri.*ri,1)' is just Y0*R^(-1) * (Y0*R^(-1))'.

But what about (rx-1)?

thanks

dilber ayhan



Subject: Mahalanobis Distance Calculation

From: Dilber Ayhan

Date: 14 Sep, 2008 19:38:01

Message: 5 of 10

Hi,

As a second question, does using the QR decomposition in the mahal function prevent problems with multicollinearity? I thought it did, but with my data set mahal could not compute a result and gave the error "the matrix is singular", since there is multicollinearity (the correlation matrix includes off-diagonal 1s).
Thanks,

dilber ayhan


Subject: Mahalanobis Distance Calculation

From: Peter Perkins

Date: 15 Sep, 2008 01:13:34

Message: 6 of 10

Dilber Ayhan wrote:

> but what about (rx-1)?

I inadvertently left that out of my response to Baris.

> Peter Perkins <Peter.PerkinsRemoveThis@mathworks.com> wrote in message <g9jubp$pkv$1@fred.mathworks.com>...

>> D = Y0*S^(-1)*Y0'
>> = Y0*(X0'*X0)^(-1)*Y0'

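Obviously, the estimate of S is (X0'*X0)/(n-1), where n is the number of rows of X (rx in the code). Inverting S therefore moves the factor (n-1) into the numerator, which is the (rx-1) you see in MAHAL.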
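
A short sketch of where the (rx-1) factor ends up, assuming random illustrative data and treating n as the number of rows of X. It compares the textbook form, using the sample covariance from cov, with the QR form from the two lines quoted earlier in the thread. The variable names are assumptions, and the sketch deliberately does not call mahal itself, so it runs without the Statistics Toolbox.

```matlab
% Illustrative data: 40 reference points and 6 query points in 4 dimensions.
rng(1);
X = randn(40, 4);
Y = randn(6, 4);
n = size(X, 1);              % plays the role of rx in the quoted lines

% Center both sets on the column means of X.
m  = mean(X, 1);
X0 = X - repmat(m, n, 1);
Y0 = Y - repmat(m, size(Y, 1), 1);

% Textbook form with the sample covariance S = (X0'*X0)/(n-1).
S     = cov(X);
dText = sum((Y0 / S) .* Y0, 2);   % row-wise y0 * S^(-1) * y0'

% QR form: inverting S moves (n-1) into the numerator.
[~, R] = qr(X0, 0);
ri  = R' \ Y0';
dQR = sum(ri .* ri, 1)' * (n - 1);

% Largest disagreement between the two forms (rounding error only).
maxDiff = max(abs(dText - dQR));
```

Without the final multiplication by (n - 1), dQR would be smaller than dText by exactly that factor.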

Subject: Mahalanobis Distance Calculation

From: Peter Perkins

Date: 15 Sep, 2008 01:18:37

Message: 7 of 10

Dilber Ayhan wrote:
> Hi,
>
> As a second question, does using the QR decomposition in the mahal function prevent problems with multicollinearity? I thought it did, but with my data set mahal could not compute a result and gave the error "the matrix is singular", since there is multicollinearity (the correlation matrix includes off-diagonal 1s).

There may be a standard or unique or useful way to define the Mahalanobis distance for a singular cov matrix, but I'm not familiar with it. You could perhaps compute a distance along the degenerate subspace using a reduced cov matrix; MAHAL does not do that.
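
One possible reading of the "reduced cov matrix" idea, sketched under loud assumptions: this is not what MAHAL does, it is not claimed to be a standard definition, and the tolerance used to decide which directions count as degenerate is an arbitrary choice. The sketch builds data with an exactly collinear column, keeps only the directions in which the centered data actually vary, and measures the distance there. Any component of a query point outside that subspace is silently ignored.

```matlab
% Illustrative data with exact collinearity: column 3 = column 1 + column 2.
rng(2);
A = randn(30, 2);
X = [A, A(:, 1) + A(:, 2)];
Y = X(1:4, :);                % query points that lie in the same subspace
n = size(X, 1);

% Center both sets on the column means of X.
m  = mean(X, 1);
X0 = X - repmat(m, n, 1);
Y0 = Y - repmat(m, size(Y, 1), 1);

% Economy-size SVD of the centered data; tiny singular values flag collinearity.
[~, Sv, V] = svd(X0, 0);
s   = diag(Sv);
tol = max(size(X0)) * eps(max(s));   % assumed tolerance, in the style of rank()
k   = sum(s > tol);                  % number of directions with real variance

% Project onto those k directions and compute the usual distance there.
Vk   = V(:, 1:k);
Zx   = X0 * Vk;                      % reduced reference data, full column rank
Zy   = Y0 * Vk;
dRed = sum((Zy / cov(Zx)) .* Zy, 2);
```

In this constructed case the result matches the ordinary distance computed from the two independent columns alone, because the reduced coordinates are an invertible linear transformation of those columns.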

Subject: Mahalanobis Distance Calculation

From: Dilber Ayhan

Date: 17 Sep, 2008 20:15:20

Message: 8 of 10

Thanks for your replies, Mr. Perkins

regards,

dilber ayhan



Subject: Mahalanobis Distance Calculation

From: Dilber Ayhan

Date: 4 Jan, 2009 22:44:02

Message: 9 of 10

Hi Peter,

In the calculation of MD, it is recommended that the values of Y be standardized. This means that

Z=(Y-M)/stdev should be used with the covariance function.
But the mahal function does not divide by the standard deviation of the Y values. If we want to use MATLAB's mahal function, should we add this to the formula, or is it already included?

Can you explain this?

Thank you for your reply.


Subject: Mahalanobis Distance Calculation

From: Peter Perkins

Date: 5 Jan, 2009 17:21:19

Message: 10 of 10

Dilber Ayhan wrote:
> Hi Peter,
>
> In the calculation of MD, it is recommended that the values of Y be standardized. This means that
>
> Z=(Y-M)/stdev should be used with the covariance function.
> But the mahal function does not divide by the standard deviation of the Y values. If we want to use MATLAB's mahal function, should we add this to the formula, or is it already included?
>
> Can you explain this?

Dilber, I'm not sure what you mean. There's a standard formula for the Mahalanobis distance; my original reply transforms that from a textbook formula into a computationally useful one. You may be referring to the fact that I used the notation Y0 to refer to a "centered" version of Y, but neglected to explain that notation. If you want to compute the MD by also standardizing with respect to variance, and then using the correlation matrix, that would be perfectly reasonable and equivalent. But that isn't what you've described above.
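
A small sketch of the equivalence mentioned above, on random illustrative data with assumed variable names. It computes the distance once from centered data with the covariance matrix, and once from data standardized by the standard deviations of X with the correlation matrix from corrcoef. Note that the standardization uses the mean and standard deviation of the reference sample X, not of Y.

```matlab
% Illustrative data with correlated columns on unequal scales.
rng(3);
X = randn(60, 3) * [2 0 0; 1 5 0; 0 3 0.5];
Y = randn(5, 3);
n = size(X, 1);

% Center on the column means of X; sd holds the column standard deviations of X.
m  = mean(X, 1);
sd = std(X, 0, 1);
X0 = X - repmat(m, n, 1);
Y0 = Y - repmat(m, size(Y, 1), 1);

% Centered data with the covariance matrix.
dCov = sum((Y0 / cov(X)) .* Y0, 2);

% Standardized data with the correlation matrix.
Zy    = Y0 ./ repmat(sd, size(Y, 1), 1);
dCorr = sum((Zy / corrcoef(X)) .* Zy, 2);

% Largest disagreement between the two versions (rounding error only).
maxDiff = max(abs(dCov - dCorr));
```

Dividing by the standard deviation while still using the covariance matrix would apply the scaling twice, and in general would not reproduce either result.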
