No BSD License  

File Size: 2.31 KB | File ID: #4374

Spearman Rank Correlation

15 Jan 2004 (Updated 06 Jul 2004)

It calculates the Spearman rank correlation coefficient.


File Information
Description

It calculates the Spearman rank correlation coefficient between a reference data series and one or more other data series, together with the associated t-statistics and p-values. The code is adapted, with major changes, from the Numerical Recipes book (http://www.nr.com/).

Example:
>> x = [1 2 3 3 3]';
>> y = [1 2 2 4 3; rand(1,5)]';
>> [r,t,p] = spear(x,y)

r =

    0.8250 -0.6000

t =

    2.5285 -1.2990

p =

    0.0855 0.2848
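
The first column of this output can be reproduced by hand. The following is a minimal sketch in base MATLAB, assuming the rank-difference formula r = 1 - 6*sum(d.^2)/(N*(N^2-1)) that appears in the code posted in the comments below. The average ranks are typed in directly rather than computed, and the second column is left out because it depends on rand.

```matlab
% Data from the example above (first column of y only).
x = [1 2 3 3 3]';
y = [1 2 2 4 3]';
N = numel(x);

% Average ranks, written out by hand: tied values share the mean of
% the positions they occupy (the three 3s in x occupy ranks 3, 4, 5).
R = [1 2 4 4 4]';
S = [1 2.5 2.5 5 4]';

% Rank-difference formula for the coefficient.
r = 1 - 6*sum((R - S).^2) / (N*(N^2 - 1));

% t-ratio on N-2 degrees of freedom.
t = r*sqrt((N - 2)/(1 - r^2));

fprintf('r = %.4f, t = %.4f\n', r, t);
```

The printed values should agree with the first entries of r and t shown above.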

Required Products: Statistics Toolbox
MATLAB release: MATLAB 7 (R14)
Comments and Ratings (28)
21 Nov 2012 Minwei CHEN

it works

31 Jan 2012 Daniel

If you want to compute tied ranks along a specific dimension of an ND-array see http://www.mathworks.com/matlabcentral/fileexchange/34560-tiedrankxdim.

18 Feb 2010 Richard Crozier

This might be useful if it did not require the Statistics Toolbox, which has its own functions for calculating this.

11 Dec 2009 Ethan Meyers

This version of Spearman's correlation gives incorrect results if there are tied values (which is very likely in many applications). It is much better to use MATLAB's own Spearman correlation function, as follows: corr(X, 'type', 'Spearman').
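
The effect of ties can be seen on the example data from the description. This is a minimal sketch in base MATLAB, not the submission's code: it assumes the usual tie-aware definition of Spearman's rho as the Pearson correlation of average ranks, and compares it with the rank-difference shortcut. The variable names (ranks, rShortcut, rTies) are chosen here for illustration.

```matlab
x = [1 2 3 3 3]';
y = [1 2 2 4 3]';
D = [x y];
N = size(D, 1);

% Average ranks per column, without the Statistics Toolbox.
ranks = zeros(N, 2);
for k = 1:2
    [~, order] = sort(D(:,k));
    rk = zeros(N, 1);
    rk(order) = (1:N)';                        % ordinal ranks
    [~, ~, group] = unique(D(:,k));            % tie group of each value
    avgRank = accumarray(group(:), rk, [], @mean);
    ranks(:,k) = avgRank(group(:));            % tied values share a rank
end

% Rank-difference shortcut (exact only when there are no ties).
rShortcut = 1 - 6*sum((ranks(:,1) - ranks(:,2)).^2) / (N*(N^2 - 1));

% Tie-aware value: Pearson correlation of the average ranks.
C = corrcoef(ranks(:,1), ranks(:,2));
rTies = C(1,2);

fprintf('shortcut: %.4f, Pearson on ranks: %.4f\n', rShortcut, rTies);
```

With ties present in both series, the two values differ, which is consistent with the discrepancies reported in this thread.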

27 May 2008 Josue Alvarez

I would like to calculate the Spearman correlation of two matrices. How can I do that?
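
One possible approach, sketched in base MATLAB under the assumption that rows are observations and columns are variables in both matrices: rank every column (averaging tied ranks) and take the Pearson correlation of the ranks. The matrices A and B and the names ranks and rho are hypothetical.

```matlab
% Hypothetical data: 5 observations, 2 variables in A and 3 in B.
A = [1 2 3 4 5; 2 1 4 3 5]';
B = [5 4 3 2 1; 1 3 2 5 4; 2 2 3 3 1]';

M = [A B];
[n, m] = size(M);

% Replace each column by its average ranks.
ranks = zeros(n, m);
for k = 1:m
    [~, order] = sort(M(:,k));
    rk = zeros(n, 1);
    rk(order) = (1:n)';
    [~, ~, group] = unique(M(:,k));
    avgRank = accumarray(group(:), rk, [], @mean);
    ranks(:,k) = avgRank(group(:));
end

% Correlate all ranked columns, then keep the A-versus-B block:
% rho(i,j) is the rank correlation between A(:,i) and B(:,j).
C = corrcoef(ranks);
nA = size(A, 2);
rho = C(1:nA, nA+1:end);
disp(rho);
```

With the Statistics Toolbox installed, corr(A, B, 'type', 'Spearman') should return a matrix laid out the same way.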

13 Mar 2008 Elisabeth Larsson

Warning if you have missing values! This program gave significant correlation with a column of NaNs!

04 Oct 2007 Lana cfai

I'm using MATLAB 2006a. When I type in the code, this appears:
"Undefined command/function 'spear'"
Can I know why?
Please advise.

16 Aug 2007 Robert Schleicher

I cross-checked the results with SPSS 14 on a trial basis and got slightly different values. I suppose that's because this version doesn't account for tied ranks (which occurred in my test).

21 Feb 2007 Priscilla Mabhande

Good reading, very informative

16 Feb 2007 Stephen Cowen

tcdf is a function in the MATLAB Statistics Toolbox

03 Jan 2007 Sergio Vallina

It does not handle NaN values. See the corrected *.m file below (it handles NaN):

*************************************
function [r,t,p]=spear(x,y)
%Syntax: [r,t,p]=spear(x,y)
%__________________________
%
% Spearman's rank correlation coefficient.
%
% r is the Spearman's rank correlation coefficient.
% t is the t-ratio of r.
% p is the corresponding p-value.
% x is the first data series (column).
% y is the second data series, a matrix which may contain one or multiple
% columns.
%
%
% Reference:
% Press W. H., Teukolsky S. A., Vetterling W. T., Flannery B. P.(1996):
% Numerical Recipes in C, Cambridge University Press. Page 641.
%
%
% Example:
% x = [1 2 3 3 3]';
% y = [1 2 2 4 3; rand(1,5)]';
% [r,t,p] = spear(x,y)
%
%
% Products Required:
% Statistics Toolbox
%
% Alexandros Leontitsis
% Department of Education
% University of Ioannina
% 45110- Dourouti
% Ioannina
% Greece
%
% University e-mail: me00743@cc.uoi.gr
% Lifetime e-mail: leoaleq@yahoo.com
% Homepage: http://www.geocities.com/CapeCanaveral/Lab/1421
%
% 3 Feb 2002.

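x = x(:);

% x and y must have equal number of rows
if size(x,1) ~= size(y,1)
    error('x and y must have equal number of rows.');
end

nc = size(y,2);
r = zeros(1,nc);
N = zeros(1,nc);

for i = 1:nc

    % Remove NaN pairwise. isnan is used instead of x>=0 so that
    % negative values are kept, and x itself is never overwritten.
    H = ~isnan(x) & ~isnan(y(:,i));
    xi = x(H);
    yi = y(H,i);

    % Find the data length for this column
    N(i) = length(xi);

    % Get the ranks of x
    R = crank(xi)';

    % Get the ranks of y
    S = crank(yi)';

    % Calculate the correlation coefficient
    r(i) = 1-6*sum((R-S).^2)/N(i)/(N(i)^2-1);

end

% Calculate the t statistic, column by column
t = zeros(1,nc);
e = (abs(r) == 1);
t(e) = r(e)*inf;
t(~e) = r(~e).*sqrt((N(~e)-2)./(1-r(~e).^2));

% Calculate the p-values (tcdf requires the Statistics Toolbox)
p = 2*(1-tcdf(abs(t),N-2));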

function r=crank(x)
%Syntax: r=crank(x)
%__________________
%
% Assigns ranks on a data series x.
%
% r is the vector of the ranks
% x is the data series. It does not need to be sorted; it is sorted internally.
%
%
% Reference:
% Press W. H., Teukolsky S. A., Vetterling W. T., Flannery B. P.(1996):
% Numerical Recipes in C, Cambridge University Press. Page 642.
%
%
% Alexandros Leontitsis
% Department of Education
% University of Ioannina
% 45110- Dourouti
% Ioannina
% Greece
%
% University e-mail: me00743@cc.uoi.gr
% Lifetime e-mail: leoaleq@yahoo.com
% Homepage: http://www.geocities.com/CapeCanaveral/Lab/1421
%
% 3 Feb 2002.

u = unique(x);
[xs,z1] = sort(x);
[z1,z2] = sort(z1);
r = (1:length(x))';
r=r(z2);

for i=1:length(u)

s=find(u(i)==x);

r(s,1) = mean(r(s));

end
***********************************

10 Oct 2006 Sunnia Chai

It is a helpful program that actually works fine. I ran SPSS to compare the results, and they matched. For tcdf: google it and you will find shared code. Just download it and add it to your work folder. Input example: x = [1 2 3]'; y = [8 2 7]'; spear(x,y);

13 Sep 2006 Kyaw Tun

I found this error

??? Undefined command/function 'tcdf'.

Error in ==> spear at 91
p=2*(1-tcdf(abs(t),N-2));
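
If the Statistics Toolbox is not available, the same two-sided p-value can be obtained from the regularized incomplete beta function, betainc, which ships with base MATLAB. This is a minimal sketch and not part of the submission; it relies on the standard identity P(|T| >= |t|) = I_x(df/2, 1/2) with x = df/(df + t^2), and uses r = -0.9, N = 5 from the 07 Jun 2005 comment as sample input.

```matlab
% Sample input taken from another comment on this page.
r = -0.9;
N = 5;

% t-ratio on N-2 degrees of freedom, as in spear.m.
df = N - 2;
t = r*sqrt(df/(1 - r^2));

% Two-sided p-value of Student's t via the regularized incomplete
% beta function; stands in for 2*(1 - tcdf(abs(t), df)).
p = betainc(df/(df + t^2), df/2, 0.5);

fprintf('t = %.4f, p = %.4f\n', t, p);
```

The printed p should match the value spear() reports for the same data (0.0374 in that comment).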

11 Apr 2006 Joost de Winter

Very helpful. Too bad that the function did not automatically handle rows instead of columns, but this was easily corrected.

08 Mar 2006 Benjamin Levy

A small point: the normal approximation for the probability values is valid for n>30 (Conover, Practical Non-parametric Statistics, 1980, Table A10). I would suggest alerting the reader/user to this in returning the p value. A modification of the code to use the rank-based critical values for Spearman's rho would be trivial (I'm going to do for myself, anyway). For Linda S., the difference in Pearson's r (corr function) and Spearman's rho diminishes as sample size grows so that the discrete ranks-based distribution becomes more and more like a continuous pdf. Try it with synthetic data for yourself.

23 Sep 2005 Peter Li

Nice implementation straight from Numerical Recipes. A simpler approach with the same result:

function [rho, rhop, rholo, rhoup] = spear(X)
% X is a matrix whose columns are variables
% and whose rows are observations

% Generate the ranks matrix Z for each column
% of X. Tying ranks are averaged as is
% standard.
Y = sort(X);
for col = 1:size(X,2)
    for row = 1:size(X,1)
        Z(X(:,col) == X(row,col),col) = mean(find(Y(:,col) == X(row,col)));
    end
end

[rho, rhop, rholo, rhoup] = corrcoef(Z);

07 Jun 2005 Linda S.

The built-in Matlab function corr() can calculate Spearman's rho like so:

[rho, pvalue] = corr(x, y, 'type', 'spearman')

What's a bit confusing is that p-values produced by spear() are different from those from corr():

>> x = [1 2 3 4 5 ]';
>> y = [8 6 7 5 4]';
>> [rho, pvalue] = corr(x, y, 'type', 'spearman')
rho =
-0.9000
pvalue =
0.0833
>>
>>
>> [r, t, p] = spear(x,y)
r =
-0.9000
t =
-3.5762
p =
0.0374

Any idea what's going on?
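
A plausible explanation, offered as an assumption rather than a confirmed reading of corr's internals: for small samples without ties, corr appears to use the exact permutation distribution of rho, whereas spear() uses the t approximation. The sketch below enumerates all 5! orderings in base MATLAB and computes the exact two-sided p-value for the data above. The names rankOf, rhoAll and pExact are chosen here for illustration.

```matlab
x = [1 2 3 4 5]';
y = [8 6 7 5 4]';
n = numel(x);

% Ranks (no ties in this data, so ordinal ranks are enough).
[~, ox] = sort(x);  rx = zeros(n,1);  rx(ox) = (1:n)';
[~, oy] = sort(y);  ry = zeros(n,1);  ry(oy) = (1:n)';

% Observed rho from the rank-difference formula.
rankOf = @(d2) 1 - 6*d2/(n*(n^2 - 1));
rhoObs = rankOf(sum((rx - ry).^2));

% Rho for every possible ordering of the y ranks.
P = perms(1:n);                                   % n! rows
d2All = sum((P - repmat(rx', size(P,1), 1)).^2, 2);
rhoAll = rankOf(d2All);

% Exact two-sided p-value: share of orderings at least as extreme.
pExact = sum(abs(rhoAll) >= abs(rhoObs) - 1e-12) / numel(rhoAll);

fprintf('rho = %.4f, exact p = %.4f\n', rhoObs, pExact);
```

Ten of the 120 orderings are at least as extreme as the observed one, giving 0.0833, the value corr() printed. With N = 5 the t approximation is coarse, which fits the other comments here about p-values for small N.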

17 Jan 2005 buffalo fbi

xp

08 Oct 2004 someguy someguy

please ignore my comments. i was drunk!

08 Oct 2004 someguy someguy

err..this is not spearman but pearson cc

19 Sep 2004 S Nishimoto

Thanks. Note that the script needs the Statistics Toolbox,
as it uses the function "tcdf".
With the toolbox, it works on Release 13.

18 Sep 2004 Dan S.

I got this error using Matlab release 13:

EDU>> spear A B
Warning: Divide by zero.
(Type "warning off MATLAB:divideByZero" to suppress this warning.)
> In C:\matlab_sv13\work\spear.m at line 54
??? Undefined function or variable 'tcdf'.

Error in ==> C:\matlab_sv13\work\spear.m
On line 68 ==> p=2*(1-tcdf(abs(t),N-2));

maybe "tcdf" is a Matlab release 14 script/function?

15 Sep 2004 Xiaoyang Tan

Thanks

15 Aug 2004 M Matsuhashi

Nice to use, but the p-value is incorrect for small N (<10)

12 Apr 2004 Jonathan D. Nelson

Note that the inputs should be an n-by-1 vector for the series one is comparing against, and an n-by-x matrix where x is the number of rank correlations one wishes to compute. The outputs are [correlation, t-value, p-value].

03 Apr 2004 Brad Vines

Glad to find it.

17 Feb 2004 Xiang Shi-Ming

It is the excellent one

21 Jan 2004 wood ma

very useful

Updates
06 Jul 2004

Spelling corrections. Thanks to Phillip Feldman!
