Missing Data

Representing Missing Data Values

Users of MATLAB® software often represent missing or unavailable data values by the special value NaN, which stands for Not-a-Number.

The IEEE® floating-point arithmetic convention defines NaN as the result of an undefined operation, such as 0/0.
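As a brief illustration (a sketch with arbitrary variable names, not part of the original example set), the following commands produce NaN from undefined operations and then test for it. Because a NaN never compares equal to any value, including itself, isnan is the dependable way to detect one.

```matlab
% Undefined operations produce NaN under IEEE arithmetic.
q = 0/0;            % NaN
r = Inf - Inf;      % also NaN

% A NaN is not equal to anything, including itself.
sameValue = (q == q);     % false

% isnan returns a logical array marking the NaN positions.
found = isnan([q r 1]);   % [true true false]
```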

Calculating with NaNs

When you perform calculations on a MATLAB variable that contains NaNs, the NaN values are propagated to the final result. This might render the result useless.

For example, consider a matrix containing the 3-by-3 magic square with its center element replaced with NaN:

a = magic(3); a(2,2) = NaN
 
a =
     8     1     6
     3   NaN     7
     4     9     2

Compute the sum for each column in the matrix:

sum(a) 
 
ans = 
    15   NaN    15

Notice that the sum of the elements in the middle column is a NaN value because that column contains a NaN.

If you do not want to have NaNs in your final results, you must remove these values from your data. For more information, see Removing NaNs from Data.
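One possible way to apply this to the column sums above is to drop the NaNs from each column before summing it. This is only a sketch: the loop and the names colSums and col are illustrative assumptions, and the result counts fewer elements in any column that contained a NaN.

```matlab
a = magic(3); a(2,2) = NaN;

% Sum each column after removing its NaN entries.
colSums = zeros(1, size(a,2));
for k = 1:size(a,2)
    col = a(:,k);
    colSums(k) = sum(col(~isnan(col)));
end
% The middle column now sums only its two remaining elements, 1 and 9.
```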

Removing NaNs from Data

You can use the MATLAB function isnan to identify NaNs in the data, and then remove them using the techniques in the following table.

Code                          Description

i = find(~isnan(x));          Find the indices of elements in a vector x that are not NaNs.
x = x(i)                      Keep only the non-NaN elements.

x = x(~isnan(x));             Remove NaNs from a vector x.

x(isnan(x)) = [];             Remove NaNs from a vector x (alternative method).

X(any(isnan(X),2),:) = [];    Remove any rows containing NaNs from a matrix X.
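To make these techniques concrete, here is a small sketch that applies them to sample data. The values and the names x1 and x2 are made up for illustration only.

```matlab
x = [1 NaN 3 NaN 5];

% Keep only the non-NaN elements (logical indexing).
x1 = x(~isnan(x));

% Alternative: delete the NaN elements from a copy.
x2 = x;
x2(isnan(x2)) = [];

% Remove every row of a matrix that contains at least one NaN.
X = [1 2; NaN 4; 5 6];
X(any(isnan(X),2),:) = [];
```

Both vector methods leave the elements 1, 3, and 5, and the matrix keeps only its first and third rows.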

If you frequently need to remove NaNs, you might want to write a short M-file function that you can call:

function X = exciseRows(X)
% Remove every row of X that contains at least one NaN.
X(any(isnan(X),2),:) = [];

The following command computes the correlation coefficients of X after all rows containing NaNs are removed:

C = corrcoef(exciseRows(X));
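The same workflow can be tried without creating a separate file. In the sketch below, dropNaNRows is a hypothetical anonymous-function stand-in for the row-removal function shown earlier, and the sample matrix is invented for illustration.

```matlab
% Hypothetical stand-in: keep only the rows that contain no NaN.
dropNaNRows = @(X) X(~any(isnan(X),2),:);

% Sample data with one incomplete observation (row 3).
X = [1 2; 2 4; NaN 5; 3 6; 4 9];

Xclean = dropNaNRows(X);   % 4-by-2, row 3 removed
C = corrcoef(Xclean);      % 2-by-2 correlation matrix with no NaNs
```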

For more information about correlation coefficients, see Linear Correlation.

Interpolating Missing Data

You can use interpolation to find intermediate points in your data. The simplest function for performing interpolation is interp1, which is a 1-D interpolation function.
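As a minimal sketch of how this can fill a gap, the following code replaces a NaN in a sampled signal with a value interpolated from its neighbors, using interp1 with its default method. The sample values and the names t, y, known, and yFilled are assumptions made for illustration.

```matlab
t = 1:5;                    % sample locations
y = [10 20 NaN 40 50];      % measurements with one missing value

known = ~isnan(y);          % logical mask of the valid samples

% Interpolate at the missing locations from the valid samples.
yFilled = y;
yFilled(~known) = interp1(t(known), y(known), t(~known));
```

Here the missing value at t = 3 lies midway between 20 and 40, so the interpolated value is 30. Note that interp1 returns NaN for query points outside the range of the known samples, so this approach does not fill gaps at the ends of the data.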

By default, the interpolation method is 'linear', which fits a straight line between a pair of existing data points to calculate the intermediate value. You can specify other methods, such as 'nearest', 'spline', or 'pchip', as an argument to the interp1 function.

For more information about interp1, see the MATLAB documentation or type the following at the MATLAB prompt:

help interp1
  


© 1984-2008 The MathWorks, Inc.