It calculates the Spearman rank correlation coefficient.
Updated 10 Oct 2006
No License
It calculates the Spearman rank correlation coefficient from 2 or more data sets, along with the associated t-statistics and p-values. The code is adapted with major changes from the Numerical Recipes book (http://www.nr.com/).
Example:
>> x = [1 2 3 3 3]';
>> y = [1 2 2 4 3; rand(1,5)]';
>> [r,t,p] = spear(x,y)
r =
0.8250 -0.6000
t =
2.5285 -1.2990
p =
0.0855 0.2848
Spelling corrections. Thanks to Phillip Feldman!
Minwei CHEN
it works
Daniel
If you want to compute tied ranks along a specific dimension of an ND-array see http://www.mathworks.com/matlabcentral/fileexchange/34560-tiedrankxdim.
Richard Crozier
This might be useful if it did not require the Statistics Toolbox, which has its own functions for calculating this.
Ethan Meyers
This version of Spearman's correlation gives incorrect results if there are tied values (which is very likely in many applications). It is much better to use Matlab's Spearman's correlation function as follows: corr(X, 'type', 'Spearman').
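To make the effect of ties concrete, here is a minimal sketch in base MATLAB (no toolbox) that compares the shortcut formula 1 - 6*sum(d^2)/(N*(N^2-1)) with the Pearson correlation of the average ranks, which is the usual tie-aware definition of Spearman's rho. The data are the x and the first y series from the example on this page; the variable names (ranks, rhoFormula, rhoTies) are made up for illustration and are not part of spear.m.

```matlab
% Illustrative sketch: shortcut formula vs. tie-aware Spearman's rho.
x = [1 2 3 3 3]';
y = [1 2 2 4 3]';
N = length(x);

% Average ranks for each series, computed without any toolbox function.
data  = [x y];
ranks = zeros(N, 2);
for c = 1:2
    v = data(:, c);
    for k = 1:N
        % rank = (number of smaller values) + (number of equal values + 1)/2
        ranks(k, c) = sum(v < v(k)) + (sum(v == v(k)) + 1) / 2;
    end
end

% Shortcut formula: exact only when there are no tied ranks.
d = ranks(:, 1) - ranks(:, 2);
rhoFormula = 1 - 6 * sum(d.^2) / (N * (N^2 - 1));

% Tie-aware value: Pearson correlation of the average ranks.
C = corrcoef(ranks(:, 1), ranks(:, 2));
rhoTies = C(1, 2);

fprintf('shortcut formula: %.4f\n', rhoFormula);
fprintf('Pearson on ranks: %.4f\n', rhoTies);
```

On this data the shortcut formula reproduces the 0.8250 shown in the example, while the Pearson correlation of the ranks comes out slightly lower (about 0.803), so the two only agree when no ranks are tied.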
I would like to calculate the Spearman correlation of two matrices. How can I do that?
Warning if you have missing values! This program gave significant correlation with a column of NaNs!
I'm using MATLAB 2006a. When I type in the code, this appears:
"Undefined command/function 'spear'"
Can I know why?
Please advise.
I cross-checked the results with SPSS 14 on a trial basis and got slightly different values. That's because this version doesn't account for tied ranks (which occurred in my test), I suppose.
Good reading, very informative
tcdf is a function in the MATLAB Statistics Toolbox.
It does not handle NaN values. See the corrected *.m file below (it handles NaN):
*************************************
function [r,t,p]=spear(x,y)
%Syntax: [r,t,p]=spear(x,y)
%__________________________
%
% Spearman's rank correlation coefficient.
%
% r is the Spearman's rank correlation coefficient.
% t is the t-ratio of r.
% p is the corresponding p-value.
% x is the first data series (column).
% y is the second data series, a matrix which may contain one or multiple
% columns.
%
%
% Reference:
% Press W. H., Teukolsky S. A., Vetterling W. T., Flannery B. P.(1996):
% Numerical Recipes in C, Cambridge University Press. Page 641.
%
%
% Example:
% x = [1 2 3 3 3]';
% y = [1 2 2 4 3; rand(1,5)]';
% [r,t,p] = spear(x,y)
%
%
% Products Required:
% Statistics Toolbox
%
% Alexandros Leontitsis
% Department of Education
% University of Ioannina
% 45110- Dourouti
% Ioannina
% Greece
%
% University e-mail: me00743@cc.uoi.gr
% Lifetime e-mail: leoaleq@yahoo.com
% Homepage: http://www.geocities.com/CapeCanaveral/Lab/1421
%
% 3 Feb 2002.
x = x(:);
% x and y must have equal number of rows
if size(x,1)~=size(y,1)
    error('x and y must have equal number of rows.');
end
for i=1:size(y,2)
    % Remove the rows where either series is NaN
    yi = y(:,i);
    H = find(~isnan(x) & ~isnan(yi));
    xi = x(H);
    yi = yi(H);
    % Find the data length (it may differ per column once NaNs are removed)
    N(i) = length(xi);
    % Get the ranks of x
    R = crank(xi)';
    % Get the ranks of y
    S = crank(yi)';
    % Calculate the correlation coefficient
    % (this formula is exact only when there are no tied ranks)
    r(i) = 1-6*sum((R-S).^2)/N(i)/(N(i)^2-1);
end
% Calculate the t statistic (r = 1 or r = -1 gives t = Inf or t = -Inf)
t = r.*sqrt((N-2)./(1-r.^2));
% Calculate the p-values
p = 2*(1-tcdf(abs(t),N-2));
function r=crank(x)
%Syntax: r=crank(x)
%__________________
%
% Assigns ranks on a data series x.
%
% r is the vector of the ranks
% x is the data series. It does not need to be sorted.
%
%
% Reference:
% Press W. H., Teukolsky S. A., Vetterling W. T., Flannery B. P.(1996):
% Numerical Recipes in C, Cambridge University Press. Page 642.
%
%
% Alexandros Leontitsis
% Department of Education
% University of Ioannina
% 45110- Dourouti
% Ioannina
% Greece
%
% University e-mail: me00743@cc.uoi.gr
% Lifetime e-mail: leoaleq@yahoo.com
% Homepage: http://www.geocities.com/CapeCanaveral/Lab/1421
%
% 3 Feb 2002.
u = unique(x);
[xs,z1] = sort(x);
[z1,z2] = sort(z1);
r = (1:length(x))';
r=r(z2);
for i=1:length(u)
s=find(u(i)==x);
r(s,1) = mean(r(s));
end
***********************************
It is a helpful program that actually works fine. I ran SPSS to compare the result, and it matched. For tcdf: google it and you will find shared code. Just download it and add it to your work folder. Input example: x = [1 2 3]';y = [8 2 7]';spear(x,y);
I found this error
??? Undefined command/function 'tcdf'.
Error in ==> spear at 91
p=2*(1-tcdf(abs(t),N-2));
Very helpful. Too bad that the function did not automatically handle rows instead of columns, but this was easily corrected.
A small point: the normal approximation for the probability values is valid for n>30 (Conover, Practical Non-parametric Statistics, 1980, Table A10). I would suggest alerting the reader/user to this when returning the p-value. A modification of the code to use the rank-based critical values for Spearman's rho would be trivial (I'm going to do it for myself, anyway). For Linda S., the difference between Pearson's r (corr function) and Spearman's rho diminishes as sample size grows, so that the discrete rank-based distribution becomes more and more like a continuous pdf. Try it with synthetic data for yourself.
Nice implementation straight from Numerical Recipes. A simpler approach with the same result:
function [rho, rhop, rholo, rhoup] = spear(X)
% X is a matrix whose columns are variables
% and whose rows are observations
% Generate the ranks matrix Z for each column
% of X. Tying ranks are averaged as is
% standard.
Y = sort(X);
for col = 1:size(X,2)
for row = 1:size(X,1)
Z(X(:,col) == X(row,col),col) = mean(find(Y(:,col) == X(row,col)));
end
end
[rho, rhop, rholo, rhoup] = corrcoef(Z);
The built-in Matlab function corr() can calculate Spearman's rho like so:
[rho, pvalue] = corr(x, y, 'type', 'spearman')
What's a bit confusing is that p-values produced by spear() are different from those from corr():
>> x = [1 2 3 4 5 ]';
>> y = [8 6 7 5 4]';
>> [rho, pvalue] = corr(x, y, 'type', 'spearman')
rho =
-0.9000
pvalue =
0.0833
>>
>>
>> [r, t, p] = spear(x,y)
r =
-0.9000
t =
-3.5762
p =
0.0374
Any idea what's going on?
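One likely explanation, going by how the MATLAB documentation describes corr: for Spearman's rho with small samples it computes the p-value from the exact permutation distribution, whereas spear() uses the t approximation, which is rough at N = 5. The sketch below is only an illustration under that assumption. It enumerates all N! rank permutations in base MATLAB and counts how many give a rho at least as extreme as the observed one. It assumes no ties, and the names (rx, ry, rhoAll, pExact) are made up for this example.

```matlab
% Illustrative sketch: exact two-sided permutation p-value for Spearman's rho.
% Assumes no tied values and a small N (the loop-free version builds N! rows).
x = [1 2 3 4 5]';
y = [8 6 7 5 4]';
N = length(x);

% Ranks without ties: the position of each value in the sorted order.
[dummy, ix] = sort(x); rx = zeros(N, 1); rx(ix) = (1:N)';
[dummy, iy] = sort(y); ry = zeros(N, 1); ry(iy) = (1:N)';

% Observed rho (the shortcut formula is exact here because there are no ties).
rho = 1 - 6 * sum((rx - ry).^2) / (N * (N^2 - 1));

% rho for every possible assignment of the ranks 1..N to the second series.
P = perms(1:N);                              % N! rows, one permutation each
D = P - repmat(rx', size(P, 1), 1);
rhoAll = 1 - 6 * sum(D.^2, 2) / (N * (N^2 - 1));

% Fraction of permutations at least as extreme as the observed rho.
% The small tolerance guards against floating-point round-off.
pExact = mean(abs(rhoAll) >= abs(rho) - 1e-12);

fprintf('rho = %.4f, exact p = %.4f\n', rho, pExact);
```

For this data 10 of the 120 permutations are at least as extreme, which gives 0.0833 and matches the value reported by corr above; the 0.0374 from spear() is the t-approximation for the same rho.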
xp
please ignore my comments. i was drunk!
err..this is not spearman but pearson cc
Thanks. Note that the script needs Statistic toolbox
as it uses a function "tcdf".
With the toolbox, it can work on Release 13.
I got this error using Matlab release 13:
EDU>> spear A B
Warning: Divide by zero.
(Type "warning off MATLAB:divideByZero" to suppress this warning.)
> In C:\matlab_sv13\work\spear.m at line 54
??? Undefined function or variable 'tcdf'.
Error in ==> C:\matlab_sv13\work\spear.m
On line 68 ==> p=2*(1-tcdf(abs(t),N-2));
maybe "tcdf" is a Matlab release 14 script/function?
Thanks
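For readers without the Statistics Toolbox, the two-sided p-value of a t statistic can also be obtained from the regularized incomplete beta function, which is how Numerical Recipes computes it and which base MATLAB provides as betainc. The following is a minimal sketch of that substitution, not the code shipped in spear.m; the input values r and N are taken from the corr() comparison elsewhere on this page, and v is just a local name for the degrees of freedom.

```matlab
% Illustrative sketch: two-sided t-test p-value without tcdf.
% Uses the identity P(|T| > |t|) = I_x(v/2, 1/2) with x = v/(v + t^2),
% where I_x is the regularized incomplete beta function (betainc).
r = -0.9;                      % Spearman's rho from the example
N = 5;                         % number of observations
v = N - 2;                     % degrees of freedom

t = r * sqrt(v / (1 - r^2));   % same t statistic that spear() computes
p = betainc(v / (v + t^2), v / 2, 0.5);

fprintf('t = %.4f, p = %.4f\n', t, p);
```

With these inputs the result agrees with the t = -3.5762 and p = 0.0374 that spear() reports, so replacing the line p=2*(1-tcdf(abs(t),N-2)) by the betainc expression should remove the toolbox dependency for this step.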
Nice to use, but p significance value is incorrect for small N (<10)
Note that the input matrices should be n-by-1 for the list one is comparing with, and n-by-x where there are x rank correlations one wishes to compute. Outputs are [correlation, t-value, p-value].
Glad to find it.
It is the excellent one
very useful