Data with Missing Values

Many data sets have one or more missing values. It is convenient to code missing values as NaN (Not a Number) to preserve the structure of data sets across multiple variables and observations.

For example:

X = magic(3);
X([1 5]) = [NaN NaN]
X =

   NaN     1     6
     3   NaN     7
     4     9     2

Normal MATLAB® arithmetic operations yield NaN values when operands are NaN:

s1 = sum(X)
s1 =

   NaN   NaN    15
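The two obvious workarounds each lose something. The sketch below tries both on the same X; the variable names v and Y are illustrative only and are not part of the toolbox.

```matlab
X = magic(3);
X([1 5]) = [NaN NaN];

% Option 1: keep only the non-NaN entries.
% Logical indexing returns a 7-by-1 column vector, so the
% 3-by-3 layout (which value belongs to which column) is lost.
v = X(~isnan(X));

% Option 2: delete every row that contains a NaN.
% Rows 1 and 2 each contain one, so only the row [4 9 2] survives
% and the valid values 1, 6, 3, and 7 are thrown away with them.
Y = X;
Y(any(isnan(Y), 2), :) = [];
```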

Removing the NaN values would destroy the matrix structure. Removing the rows containing the NaN values would discard data. Statistics Toolbox™ functions in the following table remove NaN values only for the purposes of computation.

Function     Description
nancov       Covariance matrix, ignoring NaN values
nanmax       Maximum, ignoring NaN values
nanmean      Mean, ignoring NaN values
nanmedian    Median, ignoring NaN values
nanmin       Minimum, ignoring NaN values
nanstd       Standard deviation, ignoring NaN values
nansum       Sum, ignoring NaN values
nanvar       Variance, ignoring NaN values

For example:

s2 = nansum(X)
s2 =

     7    10    15
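The other functions in the table follow the same pattern. As a rough illustration, the sketch below computes column means with nanmean and checks them against a manual calculation that divides each column's NaN-free sum by its count of non-missing entries. The names m1, m2, and counts are illustrative, and the manual version is only a cross-check, not a description of how nanmean is implemented.

```matlab
X = magic(3);
X([1 5]) = [NaN NaN];

% Column means, skipping the NaN in columns 1 and 2.
m1 = nanmean(X);            % [3.5 5 5]

% Manual cross-check: NaN-free sum divided by the number of
% non-missing entries in each column.
counts = sum(~isnan(X));    % [2 2 3]
m2 = nansum(X) ./ counts;   % [7 10 15] ./ [2 2 3]
```

Each column is averaged over a different number of observations, which is worth keeping in mind when comparing the results across columns.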

Other Statistics Toolbox functions also ignore NaN values. These include iqr, kurtosis, mad, prctile, range, skewness, and trimmean.
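For instance, applying range to the same matrix gives a finite result for every column, because the missing entries are skipped when each column's maximum and minimum are found. This is a minimal sketch; the variable name r is illustrative.

```matlab
X = magic(3);
X([1 5]) = [NaN NaN];

% Range (maximum minus minimum) of each column, ignoring NaN values:
% column 1: 4 - 3, column 2: 9 - 1, column 3: 7 - 2
r = range(X);               % [1 8 5]
```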
