Spearman Rank Correlation

version 1.0 (2.31 KB)

It calculates the Spearman rank correlation coefficient.

28 Downloads

No License

It calculates the Spearman rank correlation coefficient for two or more data sets, together with the associated t statistics and p-values. The code is adapted, with major changes, from the Numerical Recipes book (http://www.nr.com/).

Example:
>> x = [1 2 3 3 3]';
>> y = [1 2 2 4 3; rand(1,5)]';
>> [r,t,p] = spear(x,y)

r =

    0.8250 -0.6000

t =

    2.5285 -1.2990

p =

    0.0855 0.2848

Comments and Ratings (28)

Minwei CHEN

it works

Daniel

If you want to compute tied ranks along a specific dimension of an ND-array see http://www.mathworks.com/matlabcentral/fileexchange/34560-tiedrankxdim.

Richard Crozier

This might be useful if it did not require the Statistics Toolbox, which has its own functions for calculating this.

Ethan Meyers

This version of Spearman's correlation gives incorrect results if there are tied values (which is very likely in many applications). It is much better to use MATLAB's own Spearman correlation function, as follows: corr(X, 'type', 'Spearman').
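
The effect of ties can be illustrated with a minimal sketch in base MATLAB, using the x and first y column from the example at the top of the page. The helper name avgrank and the variable names are hypothetical. The sketch compares the shortcut formula 1 - 6*sum(d^2)/(N*(N^2-1)) with the Pearson correlation of the average ranks, which is the usual tie-aware definition:

```matlab
% Average ranks: tied values share the mean of the positions they occupy.
% (avgrank is an illustrative helper, not part of spear.m.)
avgrank = @(v) arrayfun(@(a) mean(find(sort(v) == a)), v);

x = [1 2 3 3 3]';
y = [1 2 2 4 3]';
n = numel(x);

rx = avgrank(x);                    % [1 2 4 4 4]'
ry = avgrank(y);                    % [1 2.5 2.5 5 4]'

% Shortcut formula, exact only when there are no ties
r_simple = 1 - 6*sum((rx - ry).^2)/(n*(n^2 - 1));

% Pearson correlation of the ranks, valid with ties
C = corrcoef(rx, ry);
r_ties = C(1,2);

fprintf('shortcut: %.4f, tie-aware: %.4f\n', r_simple, r_ties);
% prints: shortcut: 0.8250, tie-aware: 0.8030
```

The two values coincide when no ties are present, so the discrepancy only matters for data with repeated values.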

Josue Alvarez

I would like to calculate the Spearman correlation of two matrices. How can I do that?
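
One possible approach, sketched here under the assumption that both matrices hold variables in columns and observations in rows, is to rank every column and take the Pearson correlation of the ranks. The matrices A and B, the helper avgrank, and the output name rho are all hypothetical, and only base MATLAB functions are used:

```matlab
% Two matrices with the same number of rows (observations).
A = [1 2 3 4 5; 2 1 4 3 5]';        % 5-by-2
B = [5 4 3 2 1; 1 3 2 5 4]';        % 5-by-2

% Average ranks per column (illustrative helper, not part of spear.m)
avgrank = @(v) arrayfun(@(a) mean(find(sort(v) == a)), v);
RA = zeros(size(A));
RB = zeros(size(B));
for c = 1:size(A,2), RA(:,c) = avgrank(A(:,c)); end
for c = 1:size(B,2), RB(:,c) = avgrank(B(:,c)); end

% Correlate all rank columns at once, then keep the A-versus-B block
C   = corrcoef([RA RB]);
nA  = size(A,2);
rho = C(1:nA, nA+1:end);            % rho(i,j): column i of A vs column j of B
disp(rho);
```

With spear itself, the same result would require one call per column of the first matrix, since its first input is a single column.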

Elisabeth Larsson

Warning if you have missing values! This program reported a significant correlation with a column of NaNs!
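
A simple precaution, sketched here with hypothetical data and variable names, is to drop every pair in which either value is NaN before ranking:

```matlab
x = [1 2 NaN 4 5 6]';
y = [2 1 3 NaN 6 5]';

% Keep only the observations where both series have a value
keep = ~isnan(x) & ~isnan(y);
xc = x(keep);
yc = y(keep);

fprintf('Kept %d of %d pairs\n', numel(xc), numel(x));   % Kept 4 of 6 pairs

% The cleaned series can then be passed on, for example:
% [r,t,p] = spear(xc, yc);
```

If a column consists entirely of NaNs, no pairs remain and no correlation should be reported for it.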

Lana cfai

I'm using MATLAB 2006a. When I type in the code, this appears:
"Undefined command/function 'spear'"
Can anyone tell me why?
Please advise.

Robert Schleicher

I cross-checked the results with SPSS 14 on a trial basis and got slightly different values. I suppose that's because this version doesn't account for tied ranks (which occurred in my test).

Priscilla Mabhande

Good reading, very informative

Stephen Cowen

tcdf is a function in the MATLAB Statistics Toolbox.

Sergio Vallina

It does not handle NaN values. See the corrected *.m file below (it handles NaN):

*************************************
function [r,t,p]=spear(x,y)
%Syntax: [r,t,p]=spear(x,y)
%__________________________
%
% Spearman's rank correlation coefficient.
%
% r is the Spearman's rank correlation coefficient.
% t is the t-ratio of r.
% p is the corresponding p-value.
% x is the first data series (column).
% y is the second data series, a matrix which may contain one or multiple
% columns.
%
%
% Reference:
% Press W. H., Teukolsky S. A., Vetterling W. T., Flannery B. P.(1996):
% Numerical Recipes in C, Cambridge University Press. Page 641.
%
%
% Example:
% x = [1 2 3 3 3]';
% y = [1 2 2 4 3; rand(1,5)]';
% [r,t,p] = spear(x,y)
%
%
% Products Required:
% Statistics Toolbox
%
% Alexandros Leontitsis
% Department of Education
% University of Ioannina
% 45110- Dourouti
% Ioannina
% Greece
%
% University e-mail: me00743@cc.uoi.gr
% Lifetime e-mail: leoaleq@yahoo.com
% Homepage: http://www.geocities.com/CapeCanaveral/Lab/1421
%
% 3 Feb 2002.

x = x(:);

% x and y must have equal number of rows
if size(x,1)~=size(y,1)
    error('x and y must have equal number of rows.');
end

for i=1:size(y,2)
    
    % Remove pairs containing NaN (x itself is left intact for the next column)
    yi = y(:,i);
    H = find(~isnan(x) & ~isnan(yi));
    xi = x(H);
    yi = yi(H);
    
    % Find the data length for this column
    N(i) = length(xi);
    
    % Get the ranks of x
    R = crank(xi)';

    % Get the ranks of y
    S = crank(yi)';
    
    % Calculate the correlation coefficient
    r(i) = 1-6*sum((R-S).^2)/N(i)/(N(i)^2-1);
    
end

% Calculate the t statistic (infinite where the correlation is perfect)
t = r.*sqrt((N-2)./(1-r.^2));
perfect = (abs(r) == 1);
t(perfect) = r(perfect)*inf;

% Calculate the p-values
p=2*(1-tcdf(abs(t),N-2));

function r=crank(x)
%Syntax: r=crank(x)
%__________________
%
% Assigns ranks on a data series x.
%
% r is the vector of the ranks
% x is the data series. It must be sorted.
%
%
% Reference:
% Press W. H., Teukolsky S. A., Vetterling W. T., Flannery B. P.(1996):
% Numerical Recipes in C, Cambridge University Press. Page 642.
%
%
% Alexandros Leontitsis
% Department of Education
% University of Ioannina
% 45110- Dourouti
% Ioannina
% Greece
%
% University e-mail: me00743@cc.uoi.gr
% Lifetime e-mail: leoaleq@yahoo.com
% Homepage: http://www.geocities.com/CapeCanaveral/Lab/1421
%
% 3 Feb 2002.

u = unique(x);
[xs,z1] = sort(x);
[z1,z2] = sort(z1);
r = (1:length(x))';
r=r(z2);

for i=1:length(u)
    
    s=find(u(i)==x);
    
    r(s,1) = mean(r(s));
    
end
***********************************

Sunnia Chai

It is a helpful program that actually works. I compared the result against SPSS, and it matched. For tcdf: search for it online and you will find shared code. Just download it and add it to your work folder. Input example: x = [1 2 3]';y = [8 2 7]';spear(x,y);

Kyaw Tun

I found this error

??? Undefined command/function 'tcdf'.

Error in ==> spear at 91
p=2*(1-tcdf(abs(t),N-2));
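
This error appears when the Statistics Toolbox is not installed, because tcdf belongs to that toolbox. As a possible workaround, the two-sided p-value of a t statistic can be written with betainc, which is part of base MATLAB, using the identity p = I_{df/(df+t^2)}(df/2, 1/2). The sketch below is an illustration, not the submission's code, and plugs in the r value from the example at the top of the page:

```matlab
% Two-sided p-value of a t statistic without tcdf, via the regularized
% incomplete beta function: p = I_{df/(df+t^2)}(df/2, 1/2).
r  = 0.8250;                        % Spearman coefficient from the example output
N  = 5;                             % number of observations
df = N - 2;                         % degrees of freedom
t  = r*sqrt(df/(1 - r^2));          % same t-ratio formula as in spear.m
p  = betainc(df/(df + t^2), df/2, 0.5);

fprintf('t = %.4f, p = %.4f\n', t, p);
% prints: t = 2.5285, p = 0.0855
```

Replacing the last line of spear.m with the betainc expression (using abs(t) or t.^2 element-wise) should give the same p-values as the tcdf version.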

Joost de Winter

Very helpful. Too bad that the function did not automatically handle rows instead of columns, but this was easily corrected.

Benjamin Levy

A small point: the normal approximation for the probability values is valid for n>30 (Conover, Practical Non-parametric Statistics, 1980, Table A10). I would suggest alerting the reader/user to this when returning the p-value. A modification of the code to use the rank-based critical values for Spearman's rho would be trivial (I'm going to do it for myself, anyway). For Linda S., the difference between Pearson's r (corr function) and Spearman's rho diminishes as sample size grows, so that the discrete rank-based distribution becomes more and more like a continuous pdf. Try it with synthetic data for yourself.

Peter Li

Nice implementation straight from Numerical Recipes. A simpler approach with the same result:

function [rho, rhop, rholo, rhoup] = spear(X)
% X is a matrix whose columns are variables
% and whose rows are observations

% Generate the ranks matrix Z for each column
% of X. Tied ranks are averaged, as is
% standard.
Y = sort(X);
for col = 1:size(X,2)
    for row = 1:size(X,1)
        Z(X(:,col) == X(row,col),col) = mean(find(Y(:,col) == X(row,col)));
    end
end

[rho, rhop, rholo, rhoup] = corrcoef(Z);

Linda S.

The built-in Matlab function corr() can calculate Spearman's rho like so:

[rho, pvalue] = corr(x, y, 'type', 'spearman')

What's a bit confusing is that p-values produced by spear() are different from those from corr():

>> x = [1 2 3 4 5 ]';
>> y = [8 6 7 5 4]';
>> [rho, pvalue] = corr(x, y, 'type', 'spearman')
rho =
   -0.9000
pvalue =
    0.0833
>>
>>
>> [r, t, p] = spear(x,y)
r =
   -0.9000
t =
   -3.5762
p =
    0.0374

Any idea what's going on?
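
A likely explanation is that spear() uses the Student t approximation, while corr() computes Spearman p-values from the exact permutation distribution when the sample is small. The sketch below, with hypothetical variable names and base MATLAB only, enumerates all 5! = 120 rank orderings for this example and counts how many give a coefficient at least as extreme as the observed one:

```matlab
x = [1 2 3 4 5]';
y = [8 6 7 5 4]';
n = numel(x);

% Ordinal ranks (this example has no ties)
[~, ix] = sort(x);  rx(ix) = 1:n;
[~, iy] = sort(y);  ry(iy) = 1:n;

% Observed Spearman coefficient
rho_obs = 1 - 6*sum((rx - ry).^2)/(n*(n^2 - 1));

% Coefficient for every possible ordering of the ranks
P = perms(1:n);                                       % 120-by-5
rho_all = 1 - 6*sum((P - repmat(rx, size(P,1), 1)).^2, 2)/(n*(n^2 - 1));

% Exact two-sided p-value (small tolerance guards against round-off)
p_exact = mean(abs(rho_all) >= abs(rho_obs) - 1e-12);

fprintf('rho = %.4f, exact p = %.4f\n', rho_obs, p_exact);
% prints: rho = -0.9000, exact p = 0.0833
```

Ten of the 120 orderings are at least as extreme, giving 10/120 = 0.0833, which agrees with the corr() output. The value 0.0374 comes from the t distribution with 3 degrees of freedom, an approximation that is rough for so few observations.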

buffalo fbi

xp

someguy someguy

please ignore my comments. i was drunk!

someguy someguy

err..this is not spearman but pearson cc

S Nishimoto

Thanks. Note that the script needs the Statistics Toolbox,
as it uses the function "tcdf".
With the toolbox, it can work on Release 13.

Dan S.

I got this error using Matlab release 13:

EDU>> spear A B
Warning: Divide by zero.
(Type "warning off MATLAB:divideByZero" to suppress this warning.)
> In C:\matlab_sv13\work\spear.m at line 54
??? Undefined function or variable 'tcdf'.

Error in ==> C:\matlab_sv13\work\spear.m
On line 68 ==> p=2*(1-tcdf(abs(t),N-2));

maybe "tcdf" is a Matlab release 14 script/function?

Xiaoyang Tan

Thanks

M Matsuhashi

Nice to use, but the p-value is incorrect for small N (<10).

Jonathan D. Nelson

Note that the input matrices should be n-by-1 for the series one is comparing against, and n-by-x for the second input, where x is the number of rank correlations one wishes to compute. The outputs are [correlation, t-value, p-value].

Brad Vines

Glad to find it.

Xiang Shi-Ming

It is the excellent one

wood ma

very useful

Updates

Spelling corrections. Thanks to Phillip Feldman!

MATLAB Release
MATLAB 7 (R14)
