Process data by marking rows with unknown values
[y,ps] = fixunknowns(X)
[y,ps] = fixunknowns(X,FP)
Y = fixunknowns('apply',X,PS)
X = fixunknowns('reverse',Y,PS)
name = fixunknowns('name')
fp = fixunknowns('pdefaults')
pd = fixunknowns('pdesc')
fixunknowns('pcheck',fp)
fixunknowns processes matrices by replacing each row containing unknown values (represented by NaN) with two rows of information. The first row contains the original row, with NaN values replaced by the mean of that row's known values. The second row contains 1 and 0 values, indicating which values in the first row were known or unknown, respectively.
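The replacement rule described above can be sketched in base MATLAB without the toolbox. This is only an illustration of the encoding, not the toolbox implementation, and the variable names are chosen for this example.

```matlab
% Illustrative sketch of the two-row encoding (not the toolbox source).
x = [1 NaN 3 2 NaN; 3 1 1 2 4];

y = [];                                      % output, built row by row
for i = 1:size(x,1)
    row = x(i,:);
    unknown = isnan(row);
    if any(unknown)
        row(unknown) = mean(row(~unknown));  % fill NaN with mean of known values
        y = [y; row; double(~unknown)];      % data row, then 1/0 known-flag row
    else
        y = [y; row];                        % rows without NaN pass through unchanged
    end
end
disp(y)
% y is [1 2 3 2 2; 1 0 1 1 0; 3 1 1 2 4]
```

Only rows that contain at least one NaN gain a flag row, so the output has one extra row per such input row.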
[y,ps] = fixunknowns(X)
takes these inputs,
X 

and returns
Y 

PS  Process settings that allow consistent processing of values 
[y,ps] = fixunknowns(X,FP) takes an empty struct FP of parameters.
Y = fixunknowns('apply',X,PS) returns Y, given X and settings PS.
X = fixunknowns('reverse',Y,PS) returns X, given Y and settings PS.
name = fixunknowns('name') returns the name of this process method.
fp = fixunknowns('pdefaults') returns the default process parameter structure.
pd = fixunknowns('pdesc') returns the process parameter descriptions.
fixunknowns('pcheck',fp) throws an error if any parameter is illegal.
Here is how to format a matrix with a mixture of known and unknown values in its second and third rows:
x1 = [1 2 3 4; 4 NaN 6 5; NaN 2 3 NaN]
[y1,ps] = fixunknowns(x1)
Next, apply the same processing settings to new values:
x2 = [4 5 3 2; NaN 9 NaN 2; 4 9 5 2]
y2 = fixunknowns('apply',x2,ps)
Reverse the processing of y1 to get x1 again.
x1_again = fixunknowns('reverse',y1,ps)
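To make the round trip concrete, the following base-MATLAB sketch undoes the encoding by hand. It assumes the encoded form that the description above implies for x1, and it hard-codes which rows hold flags in a hypothetical variable flagRows; the toolbox keeps that bookkeeping in ps, whose fields are not described here.

```matlab
% Illustrative sketch of reversing the encoding (not the toolbox source).
% y1 is the encoding of x1 implied by the description: rows 3 and 5 hold flags.
y1 = [1 2 3 4; 4 5 6 5; 1 0 1 1; 2.5 2 3 2.5; 0 1 1 0];
flagRows = [3 5];                   % hypothetical bookkeeping for this example

x = y1;
for r = flagRows
    dataRow = x(r-1,:);             % the data row sits directly above its flag row
    dataRow(y1(r,:) == 0) = NaN;    % a 0 flag marks a value that was unknown
    x(r-1,:) = dataRow;
end
x(flagRows,:) = [];                 % drop the flag rows
disp(x)
% x is [1 2 3 4; 4 NaN 6 5; NaN 2 3 NaN]
```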
If you have input data with unknown values, you can represent them with NaN values. For example, here are five two-element vectors with unknown values in the first element of two of the vectors:
p1 = [1 NaN 3 2 NaN; 3 1 1 2 4];
The network will not be able to process the NaN values properly. Use the function fixunknowns to transform each row with NaN values (in this case only the first row) into two rows that encode that same information numerically.
[p2,ps] = fixunknowns(p1);
Here is how the first row of values was recoded as two rows.
p2 =
     1     2     3     2     2
     1     0     1     1     0
     3     1     1     2     4
The first new row is the original first row, but with the mean value for that row (in this case 2) replacing all NaN values. The elements of the second new row are now either 1, indicating the original element was a known value, or 0, indicating that it was unknown. The original second row is now the new third row. In this way both known and unknown values are encoded numerically in a way that lets the network be trained and simulated.
Whenever supplying new data to the network, you should transform the inputs in the same way, using the settings ps returned by fixunknowns when it was used to transform the training input data.
p2new = fixunknowns('apply',p1new,ps);
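As a rough illustration of what "in the same way" means, the sketch below encodes new data using values taken from the training data rather than from the new data. It assumes that the settings record which rows were flagged during training and the mean of each such row; that is an assumption about the toolbox's behavior, and unknownRows, rowMeans, and the contents of p1new are hypothetical names and values for this example.

```matlab
% Illustrative sketch of applying stored settings (not the toolbox source).
% Values taken from the training data p1 = [1 NaN 3 2 NaN; 3 1 1 2 4]:
unknownRows = 1;            % rows of p1 that contained NaN
rowMeans    = 2;            % mean of the known values in each of those rows

p1new = [4 NaN 6; 2 5 1];   % hypothetical new inputs

p2new = [];
for i = 1:size(p1new,1)
    row = p1new(i,:);
    k = find(unknownRows == i);
    if ~isempty(k)
        unknown = isnan(row);
        row(unknown) = rowMeans(k);              % assumed: reuse the training mean
        p2new = [p2new; row; double(~unknown)];  % flag row is always emitted
    else
        p2new = [p2new; row];
    end
end
disp(p2new)
% p2new is [4 2 6; 1 0 1; 2 5 1]
```

A row that was flagged during training receives a flag row even when the new data has no NaN in it, so the encoded matrix always has the number of rows the network expects.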
The function fixunknowns is only recommended for input processing. Unknown targets represented by NaN values can be handled directly by the toolbox learning algorithms. For instance, performance functions used by backpropagation algorithms recognize NaN values as unknown or unimportant values.
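The idea of treating NaN targets as unimportant can be sketched as a mean squared error computed over known targets only. This is a base-MATLAB illustration with made-up values, not the toolbox's performance function.

```matlab
% Illustrative sketch: ignore NaN targets when measuring error.
t = [1 NaN 3 4];                 % targets, one of them unknown
y = [1.5 2 2 4];                 % hypothetical network outputs
e = t - y;                       % error is NaN wherever the target is NaN
known = ~isnan(e);
mseKnown = mean(e(known).^2);    % average over known targets only
disp(mseKnown)
```

Because the unknown target contributes nothing to the average, it has no influence on the measured performance.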