MATLAB Answers

How to replace some of the values in a matrix with NaN?

Asked by Isti
on 21 Apr 2012
The simple case is like this:
2 1 4 6 2
9 4 6 1 2
5 3 2 8 3
7 2 1 9 3
7 1 8 2 4
From the matrix above, I want to insert 3 NaNs at random positions. My code is like this:
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
[rows,cols] = size(Data);
p = 3;              % number of NaNs to insert
r = randperm(25);   % random permutation of the numbers 1 to 25
r = r(1:3);         % keep 3 distinct random numbers from the range 1-25
i = 1; a = 1; b = 1;
while i <= 3        % turn each number in vector r into a position for a NaN
    n = r(a,b);
    b = b+1;
    e = 1;          % row counter
    if n <= cols
        Data(1,n) = NaN;
    else
        if n > cols
            while n > cols   % step down one row per full row of columns
                e = e+1;
                k = n - cols;
                n = k;
            end
            Data(e,n) = NaN;
        end
    end
    i = i+1;
end
One possible output looks like this:
2 1 4 6 2
9 NaN NaN NaN 2
5 3 2 8 3
7 2 1 9 3
7 1 8 2 4
So, I want to add some constraints:
1. Every row can have at most 2 NaNs.
2. The number of NaNs in column 1 has to be less than in column 2, the number of NaNs in column 2 has to be less than in column 3, and so on. E.g. the output matrix will be like this:
2 1 4 6 2
9 4 6 1 NaN
5 3 2 8 3
7 2 1 NaN 3
7 1 8 2 NaN
For the matrix above we can see that:
number of NaNs in column 1 = 0, column 2 = 0, column 3 = 0, column 4 = 1, column 5 = 2.
Can somebody help me add these constraints to my code above? Or maybe there is another solution.
Thanks in advance :')
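The two constraints can be checked mechanically. The following is a minimal sketch, assuming "less than" is read as "not greater than" (the example output above has column counts 0, 0, 0, 1, 2, so equal counts must be allowed). The variable names are chosen here for illustration and do not come from the question.

```matlab
% Example output matrix from the question.
Data = [2 1 4 6 2; 9 4 6 1 NaN; 5 3 2 8 3; 7 2 1 NaN 3; 7 1 8 2 NaN];

nanMask   = isnan(Data);        % logical mask of the NaN positions
perRow    = sum(nanMask, 2);    % number of NaNs in each row
perColumn = sum(nanMask, 1);    % number of NaNs in each column

rowsOk  = all(perRow <= 2);             % constraint 1: at most 2 NaNs per row
colsOk  = all(diff(perColumn) >= 0);    % constraint 2 (assumed reading): counts never decrease left to right
isValid = rowsOk && colsOk;
```

For the example matrix, perColumn is [0 0 0 1 2] and both checks pass.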

  2 Comments

Are you aware of the function, [I,J] = ind2sub(siz,IND)?
No, I'm not. Actually, I'm new to MATLAB :(
Could you tell me more about it, or how it would help with my problem?
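For reference, ind2sub converts a linear index into row and column subscripts, and sub2ind does the reverse. A small illustration for the 5-by-5 matrix from the question follows; the index values in IND are arbitrary examples. Note that MATLAB counts linear indices down the columns, whereas the while-loop in the question counts along the rows.

```matlab
Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
siz  = size(Data);              % [5 5]
IND  = [1 5 6 25];              % example linear indices

[I, J] = ind2sub(siz, IND);     % I = [1 5 1 5], J = [1 1 2 5]

% Both ways of addressing the matrix select the same elements.
sameElements = isequal(Data(IND), Data(sub2ind(siz, I, J)));
```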


3 Answers

Answer by per isakson
on 22 Apr 2012

This is an idea that I have not tested!
jj = 0;
for ii = r
    [rr,cc] = ind2sub( size(Data), ii );
    if sum(isnan(Data(rr,:)))>=2 || sum(isnan(Data(:,cc)))>=2
        % do nothing
    else
        Data(rr,cc)=nan;
        jj = jj + 1;
        if jj == 3, break
        end
    end
end
--- EDIT ---
The function below will return a result. The constraint is "no more than two NaN in any column or row". However, that was not what you asked for.
function Data = cssm
    Data = [2,1,4,6,2;9,4,6,1,2;5,3,2,8,3;7,2,1,9,3;7,1,8,2,4];
    p = 3; % number of NaNs to insert
    row_vector = randperm(numel(Data));
    jj = 0;
    for ii = row_vector
        [rr,cc] = ind2sub( size(Data), ii );
        if sum(isnan(Data(rr,:)))>=2 || sum( isnan(Data(:,cc)))>=2
            % do nothing
        else
            Data(rr,cc)=nan;
            jj = jj + 1;
            if jj == p, break
            end
        end
    end
end
With the constraint, "amount NaN in column 1 have to be less than column 2, and amount NaN in column 2 have to be less than column 3, and so on", read strictly for all columns, there is no solution. Do you exclude columns with zero NaN from that constraint?
Thus (according to my reading), the last column can have two or three NaN and the second-to-last column one or zero NaN. NaN cannot appear in the other columns.

  4 Comments

Which constraint will be broken? I don't get it. I think something is wrong with my understanding right now.
Thanks for the help anyway :)
Ooh, I think your suggested code doesn't fulfill my second constraint :(
Of course not; the columns with zero NaNs are also included. So the columns with zero NaNs will be the leftmost columns of the matrix.
By the way, what is ind2sub used for above? I don't get it yet.



Answer by Richard Brown on 22 Apr 2012

This is another one of these problems where the simplest way to solve it is to randomly generate candidates until you find one that fits:
A = reshape(randperm(25), 5, 5);
done = false;
while ~done
    idx = randperm(25, 3);
    [I, J] = ind2sub([5 5], idx);
    m = hist(I, 1:5);   % NaNs per row, counting rows with zero NaNs too
    n = hist(J, 1:5);   % NaNs per column, counting columns with zero NaNs too
    done = all(m <= 2) && all(diff(n) >= 0);
end
A(idx) = NaN;
It's trivial (but a little messier) to make it more general, so I'll leave you to do that if you need to.
EDIT: changed the code to use randperm instead of randi; only one call to the random number generator is necessary.
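A more general version of the same generate-and-test idea might look like the sketch below. The names nNans, maxPerRow and maxTries are assumptions made here, and the cap on the number of attempts is an added safeguard that is not part of the answer above. The column check counts columns with zero NaNs as well, so the counts must not decrease from left to right.

```matlab
A = reshape(randperm(25), 5, 5);    % example data, as in the answer
[nRows, nCols] = size(A);
nNans     = 3;      % number of NaNs to insert (assumed parameter)
maxPerRow = 2;      % constraint 1
maxTries  = 1e5;    % assumed safety cap so the loop cannot run forever

done  = false;
tries = 0;
while ~done && tries < maxTries
    tries = tries + 1;
    idx = randperm(nRows*nCols, nNans);               % distinct linear indices
    [I, J] = ind2sub([nRows nCols], idx);
    rowCounts = accumarray(I(:), 1, [nRows 1]);       % NaNs per row
    colCounts = accumarray(J(:), 1, [nCols 1]);       % NaNs per column, zeros included
    done = all(rowCounts <= maxPerRow) && all(diff(colCounts) >= 0);
end

if done
    A(idx) = NaN;
else
    warning('No valid placement found in %d tries.', maxTries);
end
```

Because every candidate is drawn blindly, the expected number of attempts grows quickly with the matrix size and the number of NaNs, so this approach is only practical for small problems.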

  1 Comment

Thanks for this answer. It works on my small dataset, but for my medium dataset (about 1500 rows by 11 columns) with more NaNs to insert, it takes a very long time, and I eventually cancelled it :(
If I drop the 2nd constraint and only want to use the 1st constraint, is there any way to make it faster?
Thanks in advance.



Answer by Richard Brown on 29 Apr 2012

Here's a much faster method that satisfies both of your constraints. It may be possible to vectorise the loop, but it is, in my opinion, not worth the effort.
First, generate the data
X = rand(1500, 11);
[m,n] = size(X);
nNans = 2000;
We figure out the row and column indices separately. The rows are easy: a single call to randperm does the trick, because each row index can come up at most twice.
I = mod(randperm(2*m, nNans), m) + 1;
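To see why this keeps every row at or below two NaNs, here is a small illustration with an assumed m = 4 (the names rowOf and counts are made up for this sketch). Each of the 2*m distinct values drawn by randperm maps to a row index, and every row index is the image of exactly two values, so no row can be picked more than twice.

```matlab
m = 4;                                      % small example size (assumption)
rowOf  = mod(1:2*m, m) + 1;                 % row index for each value 1..2*m
counts = accumarray(rowOf(:), 1, [m 1]);    % how many values map to each row

% rowOf is [2 3 4 1 2 3 4 1]; every entry of counts is 2.
```

One consequence is that nNans must not exceed 2*m, otherwise randperm cannot supply enough distinct values.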
Then figure out the column positions randomly, going row by row to avoid creating duplicate entries.
J = zeros(1, nNans);
for i = 1:m
    idx = (I == i);                     % entries that belong to row i
    J(idx) = randperm(n, nnz(idx));     % distinct columns within the row
end
We now need to make sure the columns are ordered correctly. So we construct a logical matrix encoding the position of the NaN entries, and reorder the columns to satisfy your column constraint.
iNan = false(m, n);
iNan(sub2ind([m n], I, J)) = true;
[~, iSorted] = sort(hist(J, 1:n));
iNan = iNan(:, iSorted);
We now have a logical array with the right properties. Last step is to overwrite the entries of X
X(iNan) = nan;
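Putting the steps together, the result can be checked against both constraints. The following is a sketch of the whole procedure with the checks appended; nanPerRow and nanPerCol are names introduced here, and sum over the logical mask is used in place of hist to count the NaNs per column, which gives the same counts.

```matlab
X = rand(1500, 11);
[m, n] = size(X);
nNans = 2000;                               % must not exceed 2*m

I = mod(randperm(2*m, nNans), m) + 1;       % each row index occurs at most twice
J = zeros(1, nNans);
for i = 1:m
    idx = (I == i);
    J(idx) = randperm(n, nnz(idx));         % distinct columns within a row
end

iNan = false(m, n);
iNan(sub2ind([m n], I, J)) = true;
[~, iSorted] = sort(sum(iNan, 1));          % order columns by NaN count, ascending
iNan = iNan(:, iSorted);
X(iNan) = NaN;

% Checks: total count, at most 2 per row, non-decreasing counts across columns.
nanPerRow = sum(isnan(X), 2);
nanPerCol = sum(isnan(X), 1);
assert(nnz(isnan(X)) == nNans);
assert(all(nanPerRow <= 2));
assert(all(diff(nanPerCol) >= 0));
```

Only the logical mask has its columns reordered; the data in X stays in its original column order.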
