Similarity matrix is very large; how can I process it without segmenting it?

Asked by huda nawaf on 26 Apr 2013

Hi,

I have a similarity matrix of size 17770*17770. When I process it, I get an out-of-memory error.

In fact, I first built this similarity matrix by segmenting the original matrix into 7 parts, each of size 2500*17770, and then combined these parts to get the final size. But I cannot do the next step in parts, because I want to cluster this similarity matrix, so processing it piece by piece is impossible.

Is there a way to process this similarity matrix?

Thanks in advance

22 Comments

Matt J on 28 Apr 2013

"Anyway, I want someone to tell me how to deal with blocks of the matrix in order to cluster the whole matrix."

That question becomes unnecessary if it turns out that the majority of your matrix elements are zeros. In that case, you don't have to break the matrix into blocks. You would use the SPARSE command to make the entire matrix fit into memory. Since you seem unaware of SPARSE and what it does, the others want to make sure you consider it before proceeding.
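
To make the SPARSE suggestion concrete, here is a minimal sketch on a small made-up matrix. The 1000-by-1000 size, the 1% density, and the variable names are assumptions for illustration, not the asker's data. For scale, a full 17770-by-17770 double matrix needs about 2.5 GB (17770^2 * 8 bytes), so the saving only materializes if most entries really are zero.

```matlab
% Hypothetical stand-in for a mostly-zero similarity matrix.
n = 1000;
S = sprandsym(n, 0.01);        % symmetric sparse matrix, roughly 1% nonzeros
F = full(S);                   % the same values stored densely

% Fraction of nonzero entries; worth checking before relying on sparse.
density = nnz(S) / numel(S);

% Compare the storage used by the two representations.
infoS = whos('S');
infoF = whos('F');
fprintf('density: %.4f\n', density);
fprintf('full:    %d bytes\n', infoF.bytes);
fprintf('sparse:  %d bytes\n', infoS.bytes);

% Converting an existing full matrix (one that still fits in memory):
S2 = sparse(F);
```

If the measured density turns out to be high, the sparse form can use more memory than the full one, so the density check comes first.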

Walter Roberson on 28 Apr 2013

It appears to me that you could save memory during the clustering by not using pdist yourself, and instead use

    L = linkage(d, 'ward', 'euclidean', 'savememory', 'on');

huda nawaf on 29 Apr 2013

Walter,

Ward clustering worked when I used: L = linkage(d, 'ward', 'euclidean', 'savememory', 'on');

However, I cannot predict the running time; maybe 4-5 hours. Anyway, the running time is not important because I only run it once.

You resolved a big problem, many thanks.
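
For reference, the linkage call above can be tried on a small made-up data set before committing to a multi-hour run. This is a minimal sketch that assumes the Statistics Toolbox is installed and that `d` holds raw observations (one row per item), since linkage computes the distances itself when given a metric. The random data, its size, and the choice of 3 clusters are assumptions for illustration only.

```matlab
% Hypothetical data: 200 observations with 5 features each.
d = rand(200, 5);

% Ward linkage; 'savememory' avoids building the full pairwise
% distance matrix up front.
L = linkage(d, 'ward', 'euclidean', 'savememory', 'on');

% Cut the resulting tree into a fixed number of clusters.
T = cluster(L, 'maxclust', 3);

% L has one row per merge (n-1 rows); T has one label per observation.
fprintf('merges: %d, clusters: %d\n', size(L, 1), numel(unique(T)));
```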

Walter, I would also like to use spectral clustering instead of Ward, to show the difference between the two in terms of clustering. Earlier I faced the same problem (out of memory) with spectral clustering. What do I have to change in the following code? The function below calls another function, but the out-of-memory error happens before that call.

    % Load the normalized similarity matrix.
    sim = dlmread('d:\matlab\r2011a\bin\netflix\combain_arrays_sim\sim2_norm.txt');
    p = size(sim, 1);

    % Degree of each node: number of nonzero entries in its row.
    deg = sum(sim ~= 0, 2)';
    total_edg = sum(deg) / 2;

    % Compute the modularity matrix.
    B = sim - (deg' * deg) / (2 * total_edg);

    % Compute eigenvalues and eigenvectors, sorted by descending eigenvalue.
    [U, Beta] = eig(B);
    [Beta1, ind] = sort(diag(Beta), 'descend');

    if Beta1(1) > 0
        % Assign each node +1 or -1 by the sign of the leading eigenvector.
        s = -ones(1, p);
        s(U(:, ind(1)) > 0) = 1;

        % Divide the nodes into two groups unless all fall on one side.
        if sum(s) ~= length(s) && sum(s) ~= -length(s)
            trac  = find(s > 0);    % node indices in group 1
            trac1 = find(s < 0);    % node indices in group 2

            % Rows and columns of B restricted to each group.
            Grp_1 = B(trac, trac);
            Grp_2 = B(trac1, trac1);

            % Subtract each row sum from the diagonal of its block.
            B_updat  = Grp_1 - diag(sum(Grp_1, 2));
            B1_updat = Grp_2 - diag(sum(Grp_2, 2));

            itr = 1;
            nn = (s * B * s') / 2;
            z = 0;
            Divide1_2(B_updat, z, trac, itr, nn);
            Divide1_2(B1_updat, z, trac1, itr, nn);
        else
            disp('the network is indivisible because s is indivisible');
            return;
        end
    else
        disp('the network is indivisible because Beta1<0');
    end
    fclose('all');
huda nawaf
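
One possible direction for the code above, offered as a sketch rather than a tested fix for the Netflix data: the product deg'*deg and the full eig call each need dense n-by-n storage, which is likely where memory runs out. If the similarity matrix is symmetric and sparse enough, eigs can be asked for only the leading eigenvector through a function handle, so the modularity matrix is never formed. The small planted two-group test matrix and the names `Bfun`, `m`, and `u` are hypothetical, and the sketch covers only the first bisection, not the recursive Divide1_2 step.

```matlab
% Hypothetical symmetric sparse similarity matrix with two planted groups.
n = 200;
half = n / 2;
blk = abs(sprandsym(half, 0.2));                    % dense-ish within a group
sim = blkdiag(blk, blk) + abs(sprandsym(n, 0.01));  % sparse links between groups

% Node degrees (nonzeros per row) and total edge count.
deg = full(sum(sim ~= 0, 2));   % n-by-1 column vector
m = sum(deg) / 2;

% Apply B = sim - deg*deg'/(2m) to a vector without building B.
Bfun = @(x) sim * x - deg * ((deg' * x) / (2 * m));

% Largest algebraic eigenvalue and its eigenvector.
opts = struct('issym', true);   % tells eigs the operator is symmetric
[u, beta] = eigs(Bfun, n, 1, 'la', opts);

% Sign split, as in the original code.
s = ones(n, 1);
s(u <= 0) = -1;
fprintf('leading eigenvalue: %.4f, group sizes: %d / %d\n', ...
        beta, sum(s > 0), sum(s < 0));
```

Note that dlmread returns a full matrix, so the file would still have to be read in pieces or converted with sparse before this helps; whether the real matrix is sparse enough is not known from the thread.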
