## Similarity matrix is very large; how can I process it without segmenting it?

### huda nawaf

on 26 Apr 2013

*Hi,*

*I have a similarity matrix of size 17770-by-17770. When I process it, I get an out-of-memory error.*

*In fact, I first obtained this similarity matrix by segmenting the original matrix into 7 parts, each of size 2500-by-17770, and then combining the parts to get the final matrix. But I cannot do the next step part by part, because I want to cluster this similarity matrix as a whole.*

*Is there a way to process this similarity matrix?*

### Matt J

on 28 Apr 2013

*Anyway, I want someone to tell me how to deal with blocks of a matrix in order to cluster the whole matrix?*

That question becomes unnecessary if it turns out that the majority of your matrix elements are zeros. In that case, you don't have to break the matrix into blocks. You would use the SPARSE command to make the entire matrix fit into memory. Since you seem unaware of SPARSE and what it does, the others want to make sure you consider it before proceeding.
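For scale, a dense 17770-by-17770 matrix of doubles needs roughly 2.5 GB (17770^2 elements at 8 bytes each). The sketch below illustrates the point about `sparse` on a small toy matrix; the size `n` and the pattern of nonzeros are made up for illustration and are not taken from the poster's data. Whether the real similarity matrix is sparse enough to benefit would need to be checked the same way, for example with `nnz`.

```matlab
% Toy stand-in for the similarity matrix (assumption: mostly zeros).
n = 1000;
S_full = zeros(n);
S_full(1:7:end) = 0.5;               % hypothetical nonzero pattern

% Convert to sparse storage: only the nonzero entries are kept.
S = sparse(S_full);

% Fraction of nonzero entries; sparse storage pays off when this is small.
density = nnz(S) / numel(S);

% Compare the memory used by the two representations.
info_full   = whos('S_full');
info_sparse = whos('S');
fprintf('density = %.3f, full = %d bytes, sparse = %d bytes\n', ...
        density, info_full.bytes, info_sparse.bytes);
```

If most entries are nonzero, sparse storage can end up larger than the dense form, since each stored entry also carries index information.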

### Walter Roberson

on 28 Apr 2013

It appears to me that you could save memory during the clustering by not calling pdist yourself and instead using:

```matlab
L = linkage(d, 'ward', 'euclidean', 'savememory', 'on');
```
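As a rough illustration of that call, here is a small self-contained sketch. The data matrix `X` is synthetic (two made-up groups of points), and the follow-up `cluster` call with two clusters is an assumption about how the tree might be cut; neither comes from the thread. It requires the Statistics Toolbox.

```matlab
% Synthetic data: rows are observations, columns are features (assumption).
rng(0);
X = [randn(50, 3); randn(50, 3) + 5];

% Pass the raw data, not pdist output, so linkage can avoid building the
% full pairwise distance matrix when 'savememory' is 'on'.
Z = linkage(X, 'ward', 'euclidean', 'savememory', 'on');

% Cut the hierarchical tree into a chosen number of clusters (2 here).
T = cluster(Z, 'maxclust', 2);
```

Note that with a metric argument such as `'euclidean'`, the first input is treated as a data matrix rather than as a precomputed distance vector.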

### huda nawaf

on 29 Apr 2013

Walter,

Ward clustering worked when I used `L = linkage(d, 'ward', 'euclidean', 'savememory', 'on');`.

But I cannot predict the running time; maybe 4-5 hours. Anyway, the running time is not important because I only run it once.

You resolved a big problem, many thanks.

Walter, suppose I want to use spectral clustering instead of Ward to show the difference between the two clusterings. Earlier I faced the same problem (out of memory) with spectral clustering. What do I have to change in the following code? The function below calls another function, but the out-of-memory error happens before that call.

```matlab
sim=dlmread('d:\matlab\r2011a\bin\netflix\combain_arrays_sim\sim2_norm.txt');
[p o]=size(sim)
for i=1:p
    x=sim(i,:);
    x=x(x~=0);
    deg(i)=length(x);
end
total_edg=sum(deg)/2
%%%%%compute the modularity matrx
B=sim-((deg'*deg)/(2*total_edg));
'%%%compute eignvalue and eignvector'
[U Beta]=eig(B);
Beta1=diag(Beta);
[Beta1 ind]=sort(Beta1,'descend');
if Beta1(1)>0
    bb=find(U(:,ind(1))>0);
    for i=1:length(bb)
        s(bb(i))=1;
    end
    bb1=find(U(:,ind(1))<=0);
    for j=1:length(bb1)
        s(bb1(j))=-1;
    end
    v=s*B*s'
    % if v>0
    ' %%%divide the eignvector into two groups'
    if sum(s)~=length(s)&& sum(s)~=-length(s)
        k=1;k1=1;
        for j=1:length(s)
            if s(j)>0
                for j1=1:o
                    Grp_1(k,j1)=B(j,j1);
                    trac(k)=j;
                end
                k=k+1;
            else
                for j2=1:o
                    Grp_2(k1,j2)=B(j,j2);
                    trac1(k1)=j;
                end
                k1=k1+1;
            end
        end
        tt=[trac1(1:length(trac1))];
        Grp_1(:,tt)=[];
        tt1=[trac(1:length(trac))];
        Grp_2(:,tt1)=[];
        hh=sum(Grp_1');
        [p o]=size(Grp_1');
        for i=1:p
            for j=1:o
                if i==j
                    B_updat(i,j)=Grp_1(i,j)-hh(i);
                else
                    B_updat(i,j)=Grp_1(i,j);
                end
            end
        end
        hh1=sum(Grp_2');
        [p o]=size(Grp_2);
        for i=1:p
            for j=1:o
                if i==j
                    B1_updat(i,j)=Grp_2(i,j)-hh1(i);
                else
                    B1_updat(i,j)=Grp_2(i,j);
                end
            end
        end
        itr=1
        nn=(s*B*s')/2;
        z=0;
        Divide1_2(B_updat,z,trac,itr,nn);
        Divide1_2(B1_updat,z,trac1,itr,nn);
    else
        'the network is indivisible because s is indivisible'
        return;
    end
else
    'the network is indivisible because Beta1<0'
end
fclose all
```
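One possible direction for the memory problem in the first steps of this code, offered as a sketch and not as a tested fix: the outer product `deg'*deg` and the matrix `B` are both dense 17770-by-17770 arrays (about 2.5 GB each in double precision), and `eig(B)` computes every eigenpair although only the leading one is used. If the similarity matrix can be held in sparse form, the product of `B` with a vector can be computed without ever forming `B`, and `eigs` can be asked for the single largest eigenvalue. The toy graph below (two hypothetical 20-node groups joined by one edge) and the names `A`, `Bfun`, `u`, and `lambda1` are illustrative assumptions, not part of the original post.

```matlab
% Toy sparse symmetric similarity matrix standing in for sim (assumption):
% two fully connected 20-node groups joined by a single edge.
nb  = 20;
blk = ones(nb) - eye(nb);
A   = sparse(blkdiag(blk, blk));
A(nb, nb+1) = 1;
A(nb+1, nb) = 1;
n = size(A, 1);

% Degree = number of nonzeros per row, as in the loop above, but vectorized.
deg = full(sum(A ~= 0, 2));          % n-by-1 column vector
m   = sum(deg) / 2;                  % total number of edges

% Apply B = A - deg*deg'/(2m) to a vector x without forming B.
Bfun = @(x) A*x - deg * ((deg' * x) / (2*m));

% Ask eigs for only the largest algebraic eigenvalue of the symmetric operator.
opts.issym  = true;
opts.isreal = true;
[u, lambda1] = eigs(Bfun, n, 1, 'la', opts);

% Split the nodes by the sign of the leading eigenvector, as in the post.
s = ones(n, 1);
s(u <= 0) = -1;
```

The sign of `u` returned by `eigs` is arbitrary, so which group receives `+1` can differ between runs. The later steps of the posted code still build dense submatrices of `B` and would need a similar treatment.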
