Double ranking selection - distances and standard deviation
I have a matrix x (250x10 double) of normalized prices. The columns are 10 assets and the rows represent time. The first 5 rows (of the 250) look as follows:
-0.730 -0.859 -1.490 0.505 1.546 1.376 -0.343 2.075 2.349 1.104
-0.510 -0.830 -1.346 0.392 1.510 1.135 -0.343 2.301 2.267 1.487
-0.344 -0.844 -1.225 0.335 1.528 0.813 -0.496 2.326 2.145 1.487
-0.694 -0.931 -1.153 0.221 1.473 0.813 -0.343 2.175 1.818 1.334
-0.142 -0.888 -1.104 0.221 1.510 1.054 -0.232 2.175 1.370 1.334
I want to calculate the distance between all possible pairs of columns (assets), summed over the rows. So I created a matrix Qdist (10x10) for all possible pairs. After that I create two vectors in which I determine, for each column (asset), the smallest distance [m] and the related pair column (asset) [n]. In the last step I rank the pairs from 1 to 10; I want to use these ranks in the next step.
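n2 = size(x,2);                 % number of assets (columns)
Qdist = zeros(n2,n2);           % pairwise sum of squared differences
for i = 1:n2
    for j = 1:i-1
        Qdist(i,j) = sum((x(:,i)-x(:,j)).^2);
        Qdist(j,i) = Qdist(i,j);    % the distance matrix is symmetric
    end
    Qdist(i,i) = nan;           % exclude an asset paired with itself
end
[m,n] = min(Qdist);             % m: smallest distance per asset, n: its closest partner
pairs(:,1) = n';
[~,r] = sort(m);                % assets ordered by smallest distance
[~,xRanks] = sort(r);           % rank all stocks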
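For reference, the same distance matrix can probably be built without the double loop. The following is only a sketch: it uses random stand-in data in place of the real x, and it relies on the identity sum((a-b).^2) = a'*a + b'*b - 2*a'*b, which can introduce tiny rounding differences compared with the loop.

```matlab
% Hypothetical stand-in for the real 250x10 matrix of normalized prices
rng(0);
x = randn(250,10);

n2 = size(x,2);
G  = x' * x;                            % Gram matrix: G(i,j) = x(:,i)' * x(:,j)
d  = diag(G);                           % squared norm of each column
Qdist = bsxfun(@plus, d, d') - 2*G;     % sum of squared differences for every pair
Qdist = max(Qdist, 0);                  % clip tiny negative values caused by rounding
Qdist(1:n2+1:end) = NaN;                % NaN on the diagonal, as in the loop version

[m,n] = min(Qdist);                     % min skips NaN, so self-pairs are never chosen
```

For a 10x10 matrix the loop is already fast enough; the vectorized form mainly pays off with many more assets.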
After ranking the pairs from 1 to 10, I want to select only the 8 best pairs (Q1). For these 8 pairs I want to determine the standard deviation of each column of matrix x2 (the raw prices, not normalized). Each pair therefore has two standard deviations, one per asset, so I get 16 standard deviations in total (Q2).
Matrix x2 looks as follows (first 5 rows):
11.41 5.2 1.12 4.12 4.68 1.61 8.23 6.06 9.16 2.93
11.53 5.22 1.18 4 4.66 1.58 8.23 6.15 9.14 3.03
11.62 5.21 1.23 3.94 4.67 1.54 8.12 6.16 9.11 3.03
11.43 5.15 1.26 3.82 4.64 1.54 8.23 6.1 9.03 2.99
11.73 5.18 1.28 3.82 4.66 1.57 8.31 6.1 8.92 2.99
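A rough sketch of how Q1 and Q2 might be approached is given here. It assumes that "best" means smallest distance, and it uses made-up random-walk prices in place of the real x2 and x. The names nKeep, top, pairsTop, s and pairStd are introduced only for illustration. Note that with this definition of a pair, (i,j) and (j,i) can both appear in the list.

```matlab
% Hypothetical stand-in data; replace with the real x2 (raw) and x (normalized)
rng(0);
x2 = cumsum(randn(250,10)) + 50;                                  % made-up raw prices
x  = bsxfun(@rdivide, bsxfun(@minus, x2, mean(x2)), std(x2));     % normalized prices

% Distance matrix as in the question
n2 = size(x,2);
Qdist = nan(n2,n2);                      % diagonal stays NaN
for i = 1:n2
    for j = 1:i-1
        Qdist(i,j) = sum((x(:,i)-x(:,j)).^2);
        Qdist(j,i) = Qdist(i,j);
    end
end
[m,n] = min(Qdist);                      % smallest distance and partner per asset

% Q1: keep the 8 pairs with the smallest distance
nKeep = 8;
[~,order] = sort(m);                     % order(1) is the asset with the closest partner
top = order(1:nKeep);
pairsTop = [top(:), n(top).'];           % 8x2: [asset, partner]

% Q2: standard deviation of the raw prices for both assets of every pair
s = std(x2);                             % 1x10, one value per asset
pairStd = s(pairsTop);                   % 8x2, i.e. 16 standard deviations
```

Row k of pairStd holds the two standard deviations that belong to the pair in row k of pairsTop.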
Next, I want to select the 4 pairs that rank in the top 4 by standard deviation (Q3). This means that I have to rank twice: first on distance, then on standard deviation.
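The second ranking could be sketched as follows. Since each pair has two standard deviations, some rule is needed to turn them into a single score per pair. The mean of the two is used here purely as an assumption; max or min would work the same way. The inputs pairsTop and pairStd are small made-up examples of the 8 pairs that survived the distance ranking.

```matlab
% Hypothetical inputs: 8 pairs kept after the distance ranking, with the
% standard deviations of their two assets (all values made up)
pairsTop = [1 2; 3 5; 4 6; 7 9; 8 10; 2 3; 6 1; 10 8];
pairStd  = [0.9 0.4; 1.2 0.3; 0.2 0.5; 2.0 1.1; 0.6 0.5; 0.8 0.1; 1.5 1.4; 0.3 0.2];

% Assumption: score a pair by the mean of its two standard deviations
pairScore = mean(pairStd, 2);

% Second ranking: highest score first
[~, sdOrder] = sort(pairScore, 'descend');

% Q3: keep the 4 pairs with the highest score
nFinal = 4;
finalPairs = pairsTop(sdOrder(1:nFinal), :);
```

With these made-up numbers, finalPairs contains rows 4, 7, 2 and 1 of pairsTop, in that order.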
What are your recommendations for solving Q1, Q2 and Q3?
I have already come this far (see code), but the last part seems to be the toughest.
I hope the question is clear, and I am eager to hear from you!
1 Comment
Julian
on 1 Oct 2014
Although this relates to the first part of your problem, which I believe you have solved, the old FEX contribution distance.m is handy in this context....
Answers (0)