# How can I randomly divide a dataset (matrix) into k parts?

Mariem Harmassi on 9 Sep 2012
I have a database and I want to randomly divide it into k parts of equal size. If the database has n rows, each part will contain n/k randomly chosen rows from the dataset.

Mariem Harmassi on 11 Sep 2012
Edited: Oleg Komarov on 11 Sep 2012
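function [idxo, prtA] = randDivide(M, K)
% Randomly split the rows of M into K parts of floor(n/K) rows each;
% any leftover rows go into one extra final part.
n = size(M, 1);
np = (n - rem(n, K)) / K;        % rows per part
[~, idx] = sort(rand(n, 1));     % random permutation of the row indices
C = M(idx, :);                   % shuffled copy of M
i = 1;
j = 1;
prtA = {};
idxo = {};
while i < n - mod(n, K)
    prtA{j} = C(i:i+np-1, :);
    idxo{j} = idx(i:i+np-1, 1);  % original row indices of part j
    i = i + np;
    j = j + 1;
end
if i <= n                        % leftover rows, if any
    prtA{j} = C(i:n, :);
    idxo{j} = idx(i:n, 1);
end
end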
This is my algorithm; it works very well. Thank you for your answers.

Oleg Komarov on 9 Sep 2012
Edited: Oleg Komarov on 9 Sep 2012
Suppose you have the N-by-M matrix A. I would randomly permute the positions 1:N and then group them into k partitions. The code follows.
% Sample inputs
N = 100;
A = rand(N,2);
% Number of partitions
k = 6;
% Scatter row positions
pos = randperm(N);
% Bin the positions into k partitions
edges = round(linspace(1,N+1,k+1));
Now you can "physically" partition A, or apply your code to the segments of A without actually separating it into blocks.
% Partition A
prtA = cell(k,1);
for ii = 1:k
idx = edges(ii):edges(ii+1)-1;
prtA{ii} = A(pos(idx),:); % or apply code to the selection of A
end
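A quick sanity check can confirm that the binning above assigns every row to exactly one partition. This is only a sketch that repeats the sample inputs; the names `parts`, `sizes`, `allRows` and `isCovering` are illustrative and not part of the original answer.

```matlab
% Sanity check for a random partition of row positions (sketch).
N = 100;                                  % number of rows (sample value)
k = 6;                                    % number of partitions
pos = randperm(N);                        % shuffled row positions
edges = round(linspace(1, N+1, k+1));     % k+1 bin edges over 1..N+1

sizes = diff(edges);                      % rows per partition
parts = cell(k, 1);
for ii = 1:k
    parts{ii} = pos(edges(ii):edges(ii+1)-1);
end
allRows = sort([parts{:}]);               % every selected row, sorted
isCovering = isequal(allRows, 1:N);       % true when each row appears exactly once
```

With these inputs the partition sizes differ by at most one row, which is why the cells come out as a mix of 17-row and 16-row blocks.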
EDIT
You can also avoid the loop, but in that case you have to build a group index that maps each row to the partition it belongs to, and then use accumarray() to execute your code on the partitions.
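One possible way to build such a group index is sketched below. The variable names `marks`, `slotBin`, `grp` and `colMean` are illustrative, and the per-partition mean of the first column is just a stand-in for whatever code you want to run on each partition.

```matlab
% Loop-free variant using a group index and accumarray (sketch).
N = 100; A = rand(N, 2); k = 6;           % sample inputs
pos = randperm(N);                        % shuffled row positions
edges = round(linspace(1, N+1, k+1));     % k+1 bin edges

% Slot j of the shuffled order falls in partition slotBin(j)
marks = zeros(N, 1);
marks(edges(1:k)) = 1;                    % mark the first slot of each partition
slotBin = cumsum(marks);

% grp(r) is the partition that row r of A belongs to
grp = zeros(N, 1);
grp(pos) = slotBin;

% Example reduction (assumed for illustration): mean of column 1 per partition
colMean = accumarray(grp, A(:, 1), [k 1], @mean);
```

Because accumarray() passes each group to the function as one vector, this suits reductions that return a single value per partition; code that needs whole rows of A is usually easier to keep in the loop form.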

Oleg Komarov on 10 Sep 2012
Why does it matter?
Anyways, after the loop:
% Reorder so that the smaller partitions come last
[~,idx] = sort(diff(edges),'descend');
prtA = prtA(idx);
Mariem Harmassi on 10 Sep 2012
Yes, I tried it, but I get prtA =
[17x2 double]
[17x2 double]
[17x2 double]
[17x2 double]
[16x2 double]
[16x2 double]
whereas the result that I expect is
prtA =
[17x2 double]
[17x2 double]
[17x2 double]
[17x2 double]
[17x2 double]
[15x2 double]
that is, all the partitions with the same size and the remainder in the last partition.
How can I do that?
Oleg Komarov on 11 Sep 2012
Adapting to your request, I build the edges in a slightly different way:
% Sample inputs scrambling
N = 100;
A = rand(N,2);
k = 6;
pos = randperm(N);
% Edges
edges = 1:round(N/k):N+1;
if numel(edges) < k+1
edges = [edges N+1];
end
% partition
prtA = cell(k,1);
for ii = 1:k
idx = edges(ii):edges(ii+1)-1;
prtA{ii} = A(pos(idx),:);
end
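The edges built with round(N/k) give the requested sizes for N = 100 and k = 6, but they do not seem to generalise: with N = 100 and k = 7 the step is 14, 1:14:N+1 already contains k+1 values ending at 99, and the last two shuffled positions appear to be left out. A possible variant, offered only as a sketch, uses ceil(N/k) as the block size and clips the edges at N+1; `sz` and `partSizes` are names introduced here.

```matlab
% Equal-sized blocks with the remainder in the last partition (sketch).
N = 100; A = rand(N, 2); k = 7;        % sample inputs; k = 7 is an example value
pos = randperm(N);                     % shuffled row positions
sz = ceil(N/k);                        % rows per full partition
edges = min(1 + (0:k)*sz, N+1);        % k+1 edges, clipped at N+1
prtA = cell(k, 1);
for ii = 1:k
    idx = edges(ii):edges(ii+1)-1;     % empty when the rows have run out
    prtA{ii} = A(pos(idx), :);
end
partSizes = cellfun(@(p) size(p, 1), prtA);
```

For these inputs the first six partitions hold 15 rows and the last holds 10. When N is small relative to k, the trailing partitions can come out empty, which is the price of forcing every full block to the same size.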

Azzi Abdelmalek on 9 Sep 2012
Edited: Azzi Abdelmalek on 10 Sep 2012
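A = rand(210,4); [n,m] = size(A);
np = 20;                           % rows per part (not the number of parts)
[~,idx] = sort(rand(n,1));         % random permutation of the rows
C = A(idx,:);
idnan = mod(np-rem(n,np),np);      % NaN rows needed to fill the last part
C = [C; nan(idnan,m)];
[n,m] = size(C);
res = zeros(np,m,n/np);            % preallocate: one page per part
for k = 1:n/np
    ind = (k-1)*np+1:k*np;
    res(:,:,k) = C(ind,:);
end
idxo = reshape([idx; nan(idnan,1)],np,1,n/np); % your original index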
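Since this approach pads the last part with NaN rows, those rows usually have to be dropped before the part is used. The sketch below builds the same kind of 3-D array without a loop, via reshape() and permute() rather than the loop above, and then strips the padding from the last page. It assumes the data itself contains no NaN values; `pad`, `nParts` and `last` are illustrative names.

```matlab
% Padded 3-D layout and removal of the NaN padding (sketch).
A = rand(210, 4); [n, m] = size(A);    % sample data
np = 20;                               % rows per part
idx = randperm(n)';                    % shuffled row indices (column vector)
pad = mod(-n, np);                     % NaN rows needed to fill the last part
C = [A(idx, :); nan(pad, m)];
nParts = size(C, 1) / np;

% Page p of res holds rows (p-1)*np+1 : p*np of C
res = permute(reshape(C', m, np, nParts), [2 1 3]);

last = res(:, :, end);
last = last(~any(isnan(last), 2), :);  % keep only rows without NaN padding
```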