

# mahal

Mahalanobis distance

## Syntax

```
d = mahal(Y,X)
```

## Description

`d = mahal(Y,X)` computes the Mahalanobis distance (in squared units) of each observation in `Y` from the reference sample in matrix `X`. If `Y` is n-by-m, where n is the number of observations and m is the dimension of the data, `d` is n-by-1. `X` and `Y` must have the same number of columns, but can have different numbers of rows. `X` must have more rows than columns.

For observation `I`, the Mahalanobis distance is defined by `d(I) = (Y(I,:)-mu)*inv(SIGMA)*(Y(I,:)-mu)'`, where `mu` and `SIGMA` are the sample mean and covariance of the data in `X`. `mahal` performs an equivalent, but more efficient, computation.
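The definition above can be evaluated directly with base MATLAB functions. The following is a minimal sketch, not the internal implementation of `mahal`: the data values, the variable names (`dDirect`, `dQR`, `maxDiff`), and the choice of a QR factorization as the "more efficient" route are assumptions made for illustration.

```matlab
% Hypothetical reference sample and query points, for illustration only.
rng(0);                          % fix the seed so the run is repeatable
X = randn(50,2) * [1 0.5; 0 1];  % 50 reference observations, 2 variables
Y = [1 1; 1 -1; 0 0];            % 3 observations to measure

n     = size(X,1);
mu    = mean(X);                 % 1-by-m sample mean
SIGMA = cov(X);                  % m-by-m sample covariance

% Direct evaluation of the definition: one quadratic form per row of Y.
D = Y - repmat(mu, size(Y,1), 1);
dDirect = sum((D / SIGMA) .* D, 2);

% Same quantity via a QR factorization of the centered reference sample,
% which avoids forming or inverting SIGMA explicitly (assumed approach).
C = X - repmat(mu, n, 1);
[~, R] = qr(C, 0);               % economy-size QR, so C'*C equals R'*R
Z = R' \ D';                     % one triangular solve per observation
dQR = (n - 1) * sum(Z.^2, 1)';

% The two results should agree to within rounding error.
maxDiff = max(abs(dDirect - dQR));
```

The QR route works because `SIGMA = C'*C/(n-1) = R'*R/(n-1)`, so each quadratic form reduces to `(n-1)` times the squared norm of the solution of a triangular system.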

## Examples


Generate correlated bivariate data.

```
X = mvnrnd([0;0],[1 .9;.9 1],100);
```

Input observations.

```
Y = [1 1;1 -1;-1 1;-1 -1];
```

Compute the Mahalanobis distance of observations in `Y` from the reference sample in `X`.

```
d1 = mahal(Y,X)
```
```
d1 =

    0.6288
   19.3520
   21.1384
    0.9404
```

Compute their squared Euclidean distances from the mean of `X`.

```
d2 = sum((Y-repmat(mean(X),4,1)).^2, 2)
```
```
d2 =

    1.6170
    1.9334
    2.1094
    2.4258
```

Plot the observations with `Y` values colored according to the Mahalanobis distance.

```
scatter(X(:,1),X(:,2))
hold on
scatter(Y(:,1),Y(:,2),100,d1,'*','LineWidth',2)
hb = colorbar;
ylabel(hb,'Mahalanobis Distance')
legend('X','Y','Location','NW')
```

The observations in `Y` with equal coordinate values are much closer to `X` in Mahalanobis distance than observations with opposite coordinate values, even though all observations are approximately equidistant from the mean of `X` in Euclidean distance. The Mahalanobis distance, by considering the covariance of the data and the scales of the different variables, is useful for detecting outliers in such cases.
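One common way to turn these distances into an outlier flag is to compare them with a chi-square quantile, since the squared Mahalanobis distance of multivariate normal data is approximately chi-square distributed with as many degrees of freedom as there are variables. The sketch below assumes that normality holds and uses a 97.5% cutoff; both the cutoff and the names `cutoff` and `isOutlier` are illustrative choices, not part of `mahal`.

```matlab
% Illustrative outlier flagging; the 97.5% cutoff is an assumed choice.
rng(1);                                   % fix the seed for repeatability
X = mvnrnd([0 0],[1 .9;.9 1],100);        % correlated reference sample
Y = [1 1;1 -1;-1 1;-1 -1];                % observations to test

d = mahal(Y,X);                           % squared Mahalanobis distances

% Chi-square quantile with one degree of freedom per variable.
cutoff = chi2inv(0.975, size(X,2));

% Logical column vector: true where an observation exceeds the cutoff.
isOutlier = d > cutoff;
```

For data like this, the observations with opposite coordinate values would be expected to exceed the cutoff, while those with equal coordinate values would not. A stricter or looser quantile changes how aggressively points are flagged.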