version 1.2.0.0 (3.14 KB) by
Liber Eleutherios

Fast Weighted Kendall Rank Correlation Matrix (much much faster than Matlab CORR)

TAU = KENDALLTAU(Y) returns an N-by-N matrix containing the pairwise Kendall rank correlation coefficient between each pair of columns in the T-by-N matrix Y. The coefficients are adjusted for ties (the so-called "tau-b"). Kendall's tau-b is identical to the standard tau (or tau-a) when there are no ties.

TAU = KENDALLTAU(Y, w) returns the Weighted Kendall Rank Correlation Matrix, where w is a [T * (T - 1) / 2]-by-1 vector of weights, one for each pair of observations (i, j) with i < j.

Reference: F. Pozzi, T. Di Matteo, T. Aste, "Exponential smoothing weighted correlations", The European Physical Journal B, Volume 85, Issue 6, 2012. DOI: 10.1140/epjb/e2012-20697-x

This algorithm, potentially much faster than the Matlab CORR function (seconds vs. hours), is designed for small datasets: it needs a contiguous block of your machine's virtual memory large enough to store a matrix of dimensions [T * (T - 1) / 2]-by-N.
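To get a feeling for the memory requirement, the size of the [T * (T - 1) / 2]-by-N matrix can be estimated before calling the function. The sketch below uses the T and N of EXAMPLE2 and assumes 8-byte doubles; that storage class is an assumption made for illustration, not something stated by the submission.

```matlab
% Rough memory estimate for the pairwise matrix.
% ASSUMPTION: elements are stored as 8-byte doubles.
T = 1000;                          % number of observations
N = 1000;                          % number of variables
nPairs = T * (T - 1) / 2;          % rows of the pairwise matrix
bytesNeeded = nPairs * N * 8;      % total size in bytes
fprintf('%d pairs, about %.2f GB\n', nPairs, bytesNeeded / 2^30);
```

Because the number of rows grows quadratically with T, doubling the number of observations roughly quadruples the memory needed.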

The basic idea is that Kendall's tau is nothing more than a linear correlation of all pairwise signs between variables.
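A minimal sketch of this idea for the unweighted case is given below. It is written for illustration only and is not the submission's source code; the variable names (I, J, S, d) are made up here. The signs of all pairwise differences are collected in a [T * (T - 1) / 2]-by-N matrix, and the correlation is taken without centring, so that tied pairs (zeros) drop out of both numerator and normalisation.

```matlab
% Illustrative sketch only, not the code of KENDALLTAU.
Y = [1 2 3; 2 1 3; 3 3 1; 4 4 2; 4 5 2];   % T-by-N toy data with ties
T = size(Y, 1);

% All pairs (i, j) with i < j: I holds the smaller index, J the larger one,
% ordered with I ascending and then J ascending.
[J, I] = find(tril(true(T), -1));

% Sign of every pairwise difference: entries are -1, 0 (tie) or +1.
S = sign(Y(I, :) - Y(J, :));

% Uncentred correlation between the columns of S.
d = sqrt(sum(S .^ 2, 1));          % 1-by-N norms (ties contribute nothing)
tau = (S' * S) ./ (d' * d);        % N-by-N matrix of tau-b coefficients
disp(tau)
```

If the Statistics Toolbox is available, the result can be compared with corr(Y, 'Type', 'kendall'). A column that is constant has only ties, so its norm is zero and the corresponding entries of this sketch come out as NaN.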

Note that NaN and Inf values are not allowed in Y: please clean your data before using KENDALLTAU. Also, this function does not calculate p-values (though adding them should be relatively simple).
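One possible way to clean the data is to drop every observation (row) that contains a non-finite value. This is only one choice among several (imputation would be another), shown here as a sketch on toy data; the names keep and Yclean are made up for the example.

```matlab
% Sketch: remove observations containing NaN or Inf (listwise deletion).
Y = [1 2; NaN 3; 4 Inf; 5 6];      % toy data with invalid entries
keep = all(isfinite(Y), 2);        % true for rows where every value is finite
Yclean = Y(keep, :);               % keeps rows 1 and 4 only
disp(Yclean)
```

When weights are used, w has to be built for the reduced number of observations, since its length depends on T.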

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
EXAMPLE1: How to use this function (on my very limited laptop)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% N = 300;
% T = 100;
% Y = randi([0, 100], T, N); % Lots of ties
% tic, tau1 = kendalltau(Y); toc % Try this function
%
% % ---> Elapsed time is 0.577000 seconds. <---
%
% tic, tau2 = corr(Y, 'Type', 'kendall'); toc % Try CORR
%
% % ---> Elapsed time is 132.241000 seconds. <---
%
% plot(tau1(:) - tau2(:), '.')
% set(gca, 'YLim', [-1e-12, 1e-12]); % exactly the same results
%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
EXAMPLE2: How to use this function (on a decent, fast computer with plenty of memory available).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% N = 1000; % 10/3 times bigger than before
% T = 1000; % 10 times bigger than before
% Y = randi([0, 100], T, N); % Lots of ties
% tic, tau1 = kendalltau(Y); toc % Try this function
%
% % ---> Elapsed time is 48.826421 seconds. <---
%
% tic, tau2 = corr(Y, 'Type', 'kendall'); toc % Try CORR
%
% % ---> Elapsed time is 13398.811714 seconds. <---
%
% temp = tau1(:) - tau2(:);
% temp = hist(temp);
% temp % exactly the same results
% % 0 0 0 0 0 1000000 0 0 0 0
%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
EXAMPLE3: Weighted Kendall Rank Correlation Matrix
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% N = 100; % Number of variables
% T = 200; % Number of observations
% Y = randi([0, 100], T, N); % Lots of ties
%
% % Weights with exponential smoothing
% alpha = 3 / T;
% w0 = ((exp(alpha) + 1) * (exp(alpha) - 1) ^ 2) / exp(2 * alpha) / (1 - exp(-alpha * T)) / (1 - exp(-alpha * (T - 1)));
% % Prepare indexes for all combinations without repetition
% i1 = zeros(T * (T - 1) / 2, 1); % Preallocate as column vectors
% i2 = zeros(T * (T - 1) / 2, 1);
% k = 1;
% for i = 1:(T - 1)
%     i1(k:(k + T - i - 1)) = i;
%     i2(k:(k + T - i - 1)) = (i + 1):T;
%     k = k + T - i;
% end
% w = w0 * exp(alpha * (i1 + i2 - 2 * T)); % [T * (T - 1) / 2]-by-1 weights
%
% tic, tau1 = kendalltau(Y, w); toc
% tic, tau2 = kendalltau(Y); toc
%
% plot(tau2(:), tau1(:), '.') % Compare Weighted vs
% % non-Weighted Kendall Rank
% % Correlation Matrices
%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% See also CORRCOEF, CORR.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
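The index loop of EXAMPLE3 can also be written without a loop. The sketch below is an alternative construction, not part of the submission; it builds the same pair ordering with find on a lower-triangular mask and then checks numerically that the exponential-smoothing weights sum to one, which is what the normalising constant w0 is there for.

```matlab
% Sketch: loop-free construction of the EXAMPLE3 weights.
T = 200;                           % number of observations
alpha = 3 / T;                     % smoothing parameter, as in EXAMPLE3
w0 = ((exp(alpha) + 1) * (exp(alpha) - 1) ^ 2) / exp(2 * alpha) ...
     / (1 - exp(-alpha * T)) / (1 - exp(-alpha * (T - 1)));

% Pairs (i1, i2) with i1 < i2, ordered with i1 ascending, then i2 ascending.
[i2, i1] = find(tril(true(T), -1));

% Column vector of T * (T - 1) / 2 weights; recent pairs get larger weights.
w = w0 * exp(alpha * (i1 + i2 - 2 * T));

fprintf('number of weights: %d, sum of weights: %.12f\n', numel(w), sum(w));
```

The largest weight belongs to the last pair, (T - 1, T), that is, to the two most recent observations.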

Liber Eleutherios (2021). Weighted Kendall Rank Correlation Matrix (https://www.mathworks.com/matlabcentral/fileexchange/27361-weighted-kendall-rank-correlation-matrix), MATLAB Central File Exchange.

Created with R14

Compatible with any release

**Inspired by:** Weighted Correlation Matrix


JohannesBuck: Great work - the original version for Kendall in Matlab really has huge performance problems in most settings - I'm actually wondering why you don't have more downloads for such a great programme...


Liber Eleutherios: Just one note. Suppose you have only two huge vectors:

T = 100000;
x = randn(T, 1);
y = randn(T, 1);

In that case, don't use my function; use CORR instead. My function would try to create a matrix with T * (T - 1) / 2 = 4.99995 billion rows, which is practically impossible.

By contrast, CORR performs very well in this case:

tic, z = corr(x, y, 'type', 'kendall'); toc

Elapsed time is 418.745000 seconds.