"Fredrik Åhs" <fredrik.ahs@psyk.uu.se> wrote in message
news:<3e6e671d$1@puffinus.its.uu.se>...
> I want to know the bhattacharyya distance between pairs of variables in 2
> large data matrices that I call X1 and X2.
Why?
Distances are usually used to measure dissimilarities between either
1. two data sets (S1 & S2) in an n-dimensional space
or
2. an arbitrary n-dimensional vector and either data set
Correlation coefficients are usually used to measure similarities
between variables in either
1. the union of the two sets
or
2. one of the data sets
Usually the members of S1 are the m rows of the m x n matrix X1, and
the members of S2 are the r rows of the r x n matrix X2.
> So I want to know the distance
> between [X1(1,:);X1(2,:)] and [X2(1,:);X2(2,:)], then [X1(1,:);X1(3,:)] and
> [X2(1,:);X2(3,:)], and so on.
This doesn't make sense to me. Since the second index is common
to both sets, it must be the index for variables, not set members.
If you want to measure the similarity/dissimilarity of the variables
i and j in set S1 with the ij pair in S2, then you'll want the m x 2
submatrix [X1(:,i),X1(:,j)] and the r x 2 submatrix [X2(:,i),X2(:,j)].
Is this what you are interested in?
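If so, here is a rough sketch of how the inputs for one (i,j) pair could
be formed. The data, the sizes and the names A1, A2, M1, K1 and idx are
made up for illustration. The second half relies on the fact that the
covariance of two selected columns is the matching 2 x 2 block of the
full covariance matrix, so the full statistics only need to be computed
once per data set; that may be one way to speed up a loop over pairs.

```matlab
% Hypothetical example data: rows are set members, columns are variables.
X1 = randn(50, 4);        % m x n
X2 = randn(40, 4) + 1;    % r x n
i = 1; j = 3;             % variable pair of interest

% Direct route: statistics of the two-column submatrices.
A1 = [X1(:,i), X1(:,j)];  % m x 2
A2 = [X2(:,i), X2(:,j)];  % r x 2
m1 = mean(A1)';  Cx1 = cov(A1);   % 2 x 1 mean, 2 x 2 covariance
m2 = mean(A2)';  Cx2 = cov(A2);

% Equivalent route: compute the full statistics once, then index.
M1 = mean(X1)';  K1 = cov(X1);    % n x 1 mean, n x n covariance
idx = [i j];
m1b  = M1(idx);                   % same as m1
Cx1b = K1(idx, idx);              % same as Cx1
```

The same indexing applies to X2, so inside a loop over pairs only the
cheap 2 x 2 extraction is repeated.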
> I now use the function below, which I have to
> call for every new pair of variables that I can form from my data matrices.
>
> Does any body know of some faster way to do this?
> function bhattacharyyaDist=bhattacharyya(m1,m2,Cx1,Cx2)
>
> % This function computes the bhattacharyya distance between
>
> % two data sets X1 and X2
Yes. Data sets. Not variables.
> with means m1 and m2 and covariance
>
> % matrices Cx1 and Cx2.
>
> bhattacharyyaDist=0.25*(m2-m1)'*inv(Cx1+Cx2)*(m2-m1)+0.5*...
>
> (det(Cx1+Cx2)/2*(sqrt(det(Cx1)*det(Cx2))));
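Given m1, m2, Cx1 and Cx2, think about the following:
1. Define C = (Cx1 + Cx2)/2 and d = m2 - m1 to avoid repeating
operations.
2. Compute
Db = 0.125*d'*z + 0.5*log( det(C) / sqrt(det(Cx1)*det(Cx2)) )
where z is the solution to
C*z = d
(in MATLAB, z = C\d). Therefore, explicit inversion is avoided.
Note that the second term in your function is missing the log, that
det(Cx1+Cx2)/2 is not det((Cx1+Cx2)/2), and that ".../2*(sqrt(...))"
multiplies by the square root instead of dividing by it.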
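As a minimal sketch, assuming the standard Gaussian form of the
Bhattacharyya distance (1/8 times the quadratic term in the averaged
covariance plus 1/2 times the log-determinant ratio), the computation
could look like this. The numbers are invented, and Db_inv is only there
to compare against the explicit-inverse form.

```matlab
% Hypothetical inputs: 2 x 1 means and 2 x 2 covariances of two data sets.
m1 = [0; 0];   Cx1 = [2 0.5; 0.5 1];
m2 = [1; 2];   Cx2 = [1 0.2; 0.2 3];

C = (Cx1 + Cx2)/2;   % averaged covariance, computed once
d = m2 - m1;         % difference of means, computed once
z = C \ d;           % solves C*z = d without forming inv(C)

% Quadratic term plus log-determinant term.
Db = 0.125*(d'*z) + 0.5*log(det(C)/sqrt(det(Cx1)*det(Cx2)));

% Same quantity via the explicit inverse, for comparison only.
Db_inv = 0.125*d'*inv(C)*d + 0.5*log(det(C)/sqrt(det(Cx1)*det(Cx2)));
```

For 2 x 2 matrices the two forms should agree to rounding error; the
backslash solve mainly pays off in accuracy and speed as the number of
variables grows.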
Hope this helps.
Greg